WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C12N 15/40, C07K 14/18, C12Q 1/70, 
C07K 16/10, A61K 39/29 



A2 



(11) International Publication Number: WO 99/04008

(43) International Publication Date: 28 January 1999 (28.01.99)



(21) International Application Number: PCT/US98/14688

(22) International Filing Date: 16 July 1998 (16.07.98) 



(30) Priority Data: 

60/053,062    18 July 1997 (18.07.97)       US
09/014,416    27 January 1998 (27.01.98)    US



(71) Applicant: THE GOVERNMENT OF THE UNITED STATES 

OF AMERICA, as represented by THE SECRETARY,
DEPARTMENT OF HEALTH AND HUMAN SERVICES 
[US/US]; Office of Technology Transfer, National Institutes 
of Health, Suite 325, 6011 Executive Boulevard, Rockville, 
MD 20852 (US). 

(72) Inventors: YANAGI, Masayuki; 257 Congressional Lane, 

#402, Rockville, MD 20852 (US). BUKH, Jens; 7 Center 
Drive, MSC 0740, Bethesda, MD 20892 (US). EMERSON, 
Suzanne, U.; 18201 Woodcrest Drive, Rockville, MD 20852 
(US). PURCELL, Robert, H.; 17517 White Grounds Road,
Boyds, MD 20841 (US). 

(74) Agents: FEILER, William, S. et al.; Morgan & Finnegan, 
LLP., 345 Park Avenue, New York, NY 10154 (US). 



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR, 
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GE, 
GH, GM, HR, HU, ID, IL, IS, JP, KE, KG, KP, KR, KZ, 
LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, MN, MW,
MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ,
TM, TR, TT, UA, UG, UZ, VN, YU, ZW, ARIPO patent
(GH, GM, KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent
(AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent 
(AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, 
LU, MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, 
CM, GA, GN, GW, ML, MR, NE, SN, TD, TG). 



Published 

Without international search report and to be republished 
upon receipt of that report. 



(54) Title: CLONED GENOMES OF INFECTIOUS HEPATITIS C VIRUSES AND USES THEREOF 



(57) Abstract 



The present invention discloses nucleic acid sequences which encode infectious hepatitis C viruses and the use of these sequences, 
and polypeptides encoded by all or part of these sequences, in the development of vaccines and diagnostics for HCV and in the development 
of screening assays for the identification of antiviral agents for HCV. 






FOR THE PURPOSES OF INFORMATION ONLY 
Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL Albania
AM Armenia
AT Austria
AU Australia
AZ Azerbaijan
BA Bosnia and Herzegovina
BB Barbados
BE Belgium
BF Burkina Faso
BG Bulgaria
BJ Benin
BR Brazil
BY Belarus
CA Canada
CF Central African Republic
CG Congo
CH Switzerland
CI Côte d'Ivoire
CM Cameroon
CN China
CU Cuba
CZ Czech Republic
DE Germany
DK Denmark
EE Estonia
ES Spain
FI Finland
FR France
GA Gabon
GB United Kingdom
GE Georgia
GH Ghana
GN Guinea
GR Greece
HU Hungary
IE Ireland
IL Israel
IS Iceland
IT Italy
JP Japan
KE Kenya
KG Kyrgyzstan
KP Democratic People's Republic of Korea
KR Republic of Korea
KZ Kazakstan
LC Saint Lucia
LI Liechtenstein
LK Sri Lanka
LR Liberia
LS Lesotho
LT Lithuania
LU Luxembourg
LV Latvia
MC Monaco
MD Republic of Moldova
MG Madagascar
MK The former Yugoslav Republic of Macedonia
ML Mali
MN Mongolia
MR Mauritania
MW Malawi
MX Mexico
NE Niger
NL Netherlands
NO Norway
NZ New Zealand
PL Poland
PT Portugal
RO Romania
RU Russian Federation
SD Sudan
SE Sweden
SG Singapore
SI Slovenia
SK Slovakia
SN Senegal
SZ Swaziland
TD Chad
TG Togo
TJ Tajikistan
TM Turkmenistan
TR Turkey
TT Trinidad and Tobago
UA Ukraine
UG Uganda
US United States of America
UZ Uzbekistan
VN Viet Nam
YU Yugoslavia
ZW Zimbabwe






WO 99/04008 PCT/US98/14688 






Title Of Invention 

Cloned Genomes Of Infectious 
Hepatitis C Viruses And Uses Thereof 

This application claims the benefit of U.S. 
Provisional Application No. 60/053,062 filed July 18, 
1997. 

Field Of Invention 

The present invention relates to molecular 
approaches to the production of nucleic acid sequences 
which comprise the genome of infectious hepatitis C 
viruses. In particular, the invention provides nucleic 
acid sequences which comprise the genomes of infectious 
hepatitis C viruses of genotype 1a and 1b strains. The
invention therefore relates to the use of these sequences,
and polypeptides encoded by all or part of these
sequences, in the development of vaccines and diagnostic
assays for HCV and in the development of screening assays 
for the identification of antiviral agents for HCV. 

Background Of Invention 

Hepatitis C virus (HCV) has a positive-sense
single-strand RNA genome and is a member of the virus
family Flaviviridae (Choo et al., 1991; Rice, 1996). As
for all positive-stranded RNA viruses, the genome of HCV
functions as mRNA from which all viral proteins necessary
for propagation are translated.

The viral genome of HCV is approximately 9600
nucleotides (nts) and consists of a highly conserved 5'
untranslated region (UTR), a single long open reading
frame (ORF) of approximately 9,000 nts and a complex 3'
UTR. The 5' UTR contains an internal ribosomal entry site
(Tsukiyama-Kohara et al., 1992; Honda et al., 1996). The
3' UTR consists of a short variable region, a
polypyrimidine tract of variable length and, at the 3'
end, a highly conserved region of approximately 100 nts
(Kolykhalov et al., 1996; Tanaka et al., 1995; Tanaka et
al., 1996; Yamada et al., 1996). The last 46 nucleotides
of this conserved region were predicted to form a stable
stem-loop structure thought to be critical for viral
replication (Blight and Rice, 1997; Ito and Lai, 1997;
Tsuchihara et al., 1997). The ORF encodes a large
polypeptide precursor that is cleaved into at least 10
proteins by host and viral proteinases (Rice, 1996). The
predicted envelope proteins contain several conserved
N-linked glycosylation sites and cysteine residues (Okamoto
et al., 1992a). The NS3 gene encodes a serine protease
and an RNA helicase and the NS5B gene encodes an
RNA-dependent RNA polymerase.

Globally, six major HCV genotypes (genotypes 1-6)
and multiple subtypes (a, b, c, etc.) have been
identified (Bukh et al., 1993; Simmonds et al., 1993).
The most divergent HCV isolates differ from each other by
more than 30% over the entire genome (Okamoto et al.,
1992a) and HCV circulates in an infected individual as a
quasispecies of closely related genomes (Bukh et al.,
1995; Farci et al., 1997).

At present, more than 80% of individuals
infected with HCV become chronically infected and these
chronically infected individuals have a relatively high
risk of developing chronic hepatitis, liver cirrhosis and
hepatocellular carcinoma (Hoofnagle, 1997). In the U.S.,
HCV genotypes 1a and 1b constitute the majority of
infections while in many other areas, especially in Europe
and Japan, genotype 1b predominates.

The only effective therapy for chronic hepatitis
C, interferon (IFN), induces a sustained response in less
than 25% of treated patients (Fried and Hoofnagle, 1995).
Consequently, HCV is currently the most common cause of
end stage liver failure and the reason for about 30% of
liver transplants performed in the U.S. (Hoofnagle, 1997).
In addition, a number of recent studies suggested that the
severity of liver disease and the outcome of therapy may
be genotype-dependent (reviewed in Bukh et al., 1997). In
particular, these studies suggested that infection with
HCV genotype 1b was associated with more severe liver
disease (Brechot, 1997) and a poorer response to IFN
therapy (Fried and Hoofnagle, 1995). As a result of the
inability to develop a universally effective therapy
against HCV infection, it is estimated that there are
still more than 25,000 new infections yearly in the U.S.
(Alter, 1997). Moreover, since there is no vaccine for HCV,
HCV remains a serious public health problem.

However, despite the intense interest in the
development of vaccines and therapies for HCV, progress
has been hindered by the absence of a useful cell culture
system and the lack of any small animal model for
laboratory study. For example, while replication of HCV
in several cell lines has been reported, such observations
have turned out not to be highly reproducible. In
addition, the chimpanzee is the only animal model, other
than man, for this disease. Consequently, HCV has been
able to be studied only by using clinical materials
obtained from patients or experimentally infected
chimpanzees (an animal model whose availability is very
limited).

However, several researchers have recently
reported the construction of infectious cDNA clones of
HCV, the identification of which would permit a more
effective search for susceptible cell lines and facilitate
molecular analysis of the viral genes and their function.
For example, Dash et al. (1997) and Yoo et al. (1995)
reported that RNA transcripts from cDNA clones of HCV-1
(genotype 1a) and HCV-N (genotype 1b), respectively,
resulted in viral replication after transfection into
human hepatoma cell lines. Unfortunately, the viability
of these clones was not tested in vivo and concerns were
raised about the infectivity of these cDNA clones in vitro
(Fausto, 1997). In addition, neither clone contained
the terminal 98 conserved nucleotides at the very 3' end
of the UTR.

Kolykhalov et al. (1997) and Yanagi et al.
(1997) reported the derivation from HCV strain H77 (which
is genotype 1a) of cDNA clones of HCV that are infectious
for chimpanzees. However, while these infectious clones
will aid in studying HCV replication and pathogenesis and
will provide an important tool for development of in vitro
replication and propagation systems, it is important to
have infectious clones of more than one genotype given the
extensive genetic heterogeneity of HCV and the potential
[...] effective therapies and vaccines for HCV.

Summary Of The Invention 

The present invention relates to nucleic acid
sequences which comprise the genome of infectious
hepatitis C viruses and in particular, nucleic acid
sequences which comprise the genome of infectious
hepatitis C viruses of genotype 1a and 1b strains. It is
therefore an object of the invention to provide nucleic
acid sequences which encode infectious hepatitis C
viruses. Such nucleic acid sequences are referred to
throughout the application as "infectious nucleic acid
sequences".

For the purposes of this application, nucleic
acid sequence refers to RNA, DNA, cDNA or any variant
thereof capable of directing host organism synthesis of a
hepatitis C virus polypeptide. It is understood that
nucleic acid sequence encompasses nucleic acid sequences,
which due to degeneracy, encode the same polypeptide
sequence as the nucleic acid sequences described herein.
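
The degeneracy referred to above can be illustrated with a short
calculation. The following is a minimal sketch only: the two toy
sequences, the partial codon table (restricted to the standard-code
codons actually used) and the function name are assumptions made for
illustration and are not sequences of the invention.

```python
# Partial standard genetic code: only the codons used in this toy example.
CODON_TABLE = {
    "ATG": "M",              # methionine
    "AGC": "S", "TCT": "S",  # serine (synonymous codons)
    "ACG": "T", "ACA": "T",  # threonine
    "AAT": "N", "AAC": "N",  # asparagine
    "CCT": "P", "CCA": "P",  # proline
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame from its first base."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at the nucleotide level ...
seq_a = "ATGAGCACGAATCCT"
seq_b = "ATGTCTACAAACCCA"

# ... yet encode the same polypeptide because the codons are synonymous.
print(translate(seq_a), translate(seq_b))
```

Both toy sequences would fall within the definition above with respect
to each other, since they differ only by synonymous codons.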

The invention also relates to the use of the
infectious nucleic acid sequences to produce chimeric
genomes consisting of portions of the open reading frames
of infectious nucleic acid sequences of other genotypes
(including, but not limited to, genotypes 1, 2, 3, 4, 5
and 6) and subtypes (including, but not limited to,
subtypes 1a, 1b, 2a, 2b, 2c, 3a, 4a-4f, 5a and 6a) of HCV.
For example, infectious nucleic acid sequences of the 1a and
1b strains H77 and HC-J4, respectively, described herein
can be used to produce chimeras with sequences from the
genomes of other strains of HCV from different genotypes
or subtypes. Nucleic acid sequences which comprise
sequence from the open-reading frames of 2 or more HCV
genotypes or subtypes are designated "chimeric nucleic
acid sequences".

The invention further relates to mutations of
the infectious nucleic acid sequences of the invention
where mutation includes, but is not limited to, point
mutations, deletions and insertions. In one embodiment, a
gene or fragment thereof can be deleted to determine the
effect of the deleted gene or genes on the properties of
the encoded virus such as its virulence and its ability to
replicate. In an alternative embodiment, a mutation may
be introduced into the infectious nucleic acid sequences
to examine the effect of the mutation on the properties of
the virus in the host cell.

The invention also relates to the introduction
of mutations or deletions into the infectious nucleic acid
sequences in order to produce an attenuated hepatitis C
virus suitable for vaccine development.

The invention further relates to the use of the
infectious nucleic acid sequences to produce attenuated
viruses via passage in vitro or in vivo of the viruses
produced by transfection of a host cell with the
infectious nucleic acid sequence.

The present invention also relates to the use of
the nucleic acid sequences of the invention or fragments
thereof in the production of polypeptides where "nucleic
acid sequences of the invention" refers to infectious
nucleic acid sequences, mutations of infectious nucleic
acid sequences, chimeric nucleic acid sequences and
sequences which comprise the genome of attenuated viruses
produced from the infectious nucleic acid sequences of the
invention. The polypeptides of the invention, especially
structural polypeptides, can serve as immunogens in the
development of vaccines or as antigens in the development
of diagnostic assays for detecting the presence of HCV in
biological samples.

The invention therefore also relates to vaccines
for use in immunizing mammals, especially humans, against
hepatitis C. In one embodiment, the vaccine comprises one
or more polypeptides made from a nucleic acid sequence of
the invention or fragment thereof. In a second
embodiment, the vaccine comprises a hepatitis C virus
produced by transfection of host cells with the nucleic
acid sequences of the invention.

The present invention therefore relates to
methods for preventing hepatitis C in a mammal. In one
embodiment the method comprises administering to a mammal
a polypeptide or polypeptides encoded by a nucleic acid
sequence of the invention in an amount effective to induce
protective immunity to hepatitis C. In another
embodiment, the method of prevention comprises
administering to a mammal a hepatitis C virus of the
invention in an amount effective to induce protective
immunity against hepatitis C.

In yet another embodiment, the method of
protection comprises administering to a mammal a nucleic
acid sequence of the invention or a fragment thereof in an
amount effective to induce protective immunity against
hepatitis C.

The invention also relates to hepatitis C
viruses produced by host cells transfected with the
nucleic acid sequences of the present invention.

The invention therefore also provides
pharmaceutical compositions comprising the nucleic acid
sequences of the invention and/or their encoded hepatitis
C viruses. The invention further provides pharmaceutical
compositions comprising polypeptides encoded by the
nucleic acid sequences of the invention or fragments
thereof. The pharmaceutical compositions of the invention
may be used prophylactically or therapeutically.

The invention also relates to antibodies to the
hepatitis C viruses of the invention or their encoded
polypeptides and to pharmaceutical compositions comprising
these antibodies.

The present invention further relates to
polypeptides encoded by the nucleic acid sequences of the
invention or fragments thereof. In one embodiment, said
polypeptide or polypeptides are fully or partially
purified from hepatitis C virus produced by cells
transfected with nucleic acid sequence of the invention.
In another embodiment, the polypeptide or polypeptides are
produced recombinantly from a fragment of the nucleic acid
sequences of the invention. In yet another embodiment,
the polypeptides are chemically synthesized.

The invention also relates to the use of the
nucleic acid sequences of the invention to identify cell
lines capable of supporting the replication of HCV in
vitro.

The invention further relates to the use of the
nucleic acid sequences of the invention or their encoded
proteases (e.g. NS3 protease) to develop screening assays
to identify antiviral agents for HCV.

Brief Description Of Figures 

Figure 1 shows a strategy for constructing full-length
cDNA clones of HCV strain H77. The long PCR
products amplified with H1 and H9417R primers were cloned
directly into pGEM-9zf(-) after digestion with Not I and
Xba I (pH21I and pH50I). Next, the 3' UTR was cloned into
both pH21I and pH50I after digestion with Afl II and Xba I
(pH21 and pH50). pH21 was tested for infectivity in a
chimpanzee. To improve the efficiency of cloning, we
constructed a cassette vector with consensus 5' and 3'
termini of H77. This cassette vector (pCV) was obtained
by cutting out the BamHI fragment (nts 1358-7530 of the
H77 genome) from pH50, followed by religation. Finally,
the long PCR products of H77 amplified with primers H1 and
H9417R (H product) or primers A1 and H9417R (A product)
were cloned into pCV after digestion with Age I and Afl II
or with Pin AI and Bfr I. The latter procedure yielded
multiple complete cDNA clones of strain H77 of HCV.

Figure 2 shows the results of gel
electrophoresis of long RT-PCR amplicons of the entire ORF
of H77 and the transcription mixture of the infectious
clone of H77. The complete ORF was amplified by long
RT-PCR with the primers H1 or A1 and H9417R from 10^5 GE of
H77. A total of 10 µg of the consensus chimeric clone
(pCV-H77C) linearized with Xba I was transcribed in a 100
µl reaction with T7 RNA polymerase. Five µl of the
transcription mixture was analyzed by gel electrophoresis
and the remainder of the mixture was injected into a
chimpanzee. Lane 1, molecular weight marker; lane 2,
products amplified with primers H1 and H9417R; lane 3,
products amplified with primers A1 and H9417R; lane 4,
transcription mixture containing the RNA transcripts and
linearized clone pCV-H77C (12.5 kb).

Figure 3 is a diagram of the genome organization
of HCV strain H77 and the genetic heterogeneity of
individual full-length clones compared with the consensus
sequence of H77. Solid lines represent aa changes.
Dashed lines represent silent mutations. A * in pH21
represents a point mutation at nt 58 in the 5' UTR. In
the ORF, the consensus chimeric clone pCV-H77C had 11 nt
differences [at positions 1625 (C->T), 2709 (T->C), 3380
(A->G), 3710 (C->T), 3914 (G->A), 4463 (T->C), 5058 (C->T),
5834 (C->T), 6734 (T->C), 7154 (C->T), and 7202 (T->C)] and
one aa change (F -> L at aa 790) compared with the
consensus sequence of H77. This clone was infectious.
Clones pH21 and pCV-H11 had 19 nt (7 aa) and 64 nt (21
aa) differences, respectively, compared with the consensus
sequence of H77. These two clones were not infectious. A
single point mutation in the 3' UTR at nucleotide 9406
(G->A) introduced to create an Afl II cleavage site is not
shown.

Figures 4A-4F show the complete nucleotide
sequence of a H77C clone produced according to the present
invention and Figures 4G-4H show the amino acid sequence
encoded by the H77C clone.

Figure 5 shows an agarose gel of long RT-PCR
amplicons and transcription mixtures. Lanes 1 and 4:
Molecular weight marker (Lambda/HindIII digest). Lanes 2
and 3: RT-PCR amplicons of the entire ORF of HC-J4. Lane
5: pCV-H77C transcription control (Yanagi et al., 1997).
Lanes 6, 7, and 8: 1/40 of each transcription mixture of
pCV-J4L2S, pCV-J4L4S and pCV-J4L6S, respectively, which
was injected into the chimpanzee.

Figure 6 shows the strategy utilized for the
construction of full-length cDNA clones of HCV strain
HC-J4. The long PCR products were cloned as two separate
fragments (L and S) into a cassette vector (pCV) with
fixed 5' and 3' termini of HCV (Yanagi et al., 1997).
Full-length cDNA clones of HC-J4 were obtained by
inserting the L fragment from three pCV-J4L clones into
three identical pCV-J4S9 clones after digestion with
PinAI (isoschizomer of AgeI) and BfrI (isoschizomer of
AflII).

Figure 7 shows amino acid positions with a
quasispecies of HC-J4 in the acute phase plasma pool
obtained from an experimentally infected chimpanzee.
Cons-p9: consensus amino acid sequence deduced from
analysis of nine L fragments and nine S fragments (see
Fig. 6). Cons-D: consensus sequence derived from direct
sequencing of the PCR product. A, B, C: groups of similar
viral species. Dot: amino acid identical to that in
Cons-p9. Capital letter: amino acid different from that in
Cons-p9. Cons-F: composite consensus amino acid sequence
combining Cons-p9 and Cons-D. Boxed amino acid: different
from that in Cons-F. Shaded amino acid: different from
that in all species A sequences. An *: defective ORF due
to a nucleotide deletion (clone L1, aa 1097) or insertion
(clone L7, aa 2770). Diagonal lines: fragments used to
construct the infectious clone.
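
The notion of a consensus sequence deduced from several clones, as used
for Cons-p9 above, can be sketched as a per-position majority vote over
aligned sequences. This is an illustrative sketch only: the majority
rule, the function names and the toy clone sequences are assumptions,
and the sketch is not presented as the procedure actually used to
derive Cons-p9.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent residue at each position of aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned) walks the alignment column by column.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def differences(seq, ref):
    """List (1-based position, reference residue, clone residue) mismatches."""
    return [(i + 1, r, s) for i, (r, s) in enumerate(zip(ref, seq)) if r != s]


# Hypothetical aligned amino acid fragments from five clones.
clones = ["MSTNPKPQ", "MSTNPKPQ", "MSANPKPQ", "MSTNPRPQ", "MSTNPKPQ"]
cons = consensus(clones)
print(cons)
for name, seq in zip("ABCDE", clones):
    print(name, differences(seq, cons))
```

Positions at which two residues are equally frequent would need an
explicit tie-breaking rule, which this sketch does not define.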

Figure 8 shows comparisons (percent difference)
of nucleotide (nts 156-8935) and predicted amino acid
sequences (aa 1-2864) of L clones (species A, B, and C,
this study), HC-J4/91 (Okamoto et al., 1992b) and HC-J4/83
(Okamoto et al., 1992b). Differences among species A
sequences and among species B sequences are shaded.
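
A percent difference of the kind compared in Figure 8 can be computed
from a pairwise alignment as the fraction of compared positions that
mismatch. The following is a minimal sketch under stated assumptions:
the function name, the choice to skip positions where either sequence
has a gap ("-"), and the toy sequences are all hypothetical and are not
taken from the analysis reported here.

```python
def percent_difference(a: str, b: str) -> float:
    """Percent of compared positions at which two aligned sequences differ.

    Positions where either sequence carries a gap character are skipped
    (an assumption of this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    mismatches = sum(1 for x, y in compared if x != y)
    return 100.0 * mismatches / len(compared)


# Toy aligned nucleotide sequences: 2 mismatches over 10 positions.
print(percent_difference("ACGTACGTAC", "ACGTTCGTAA"))
# With a gap: 4 comparable positions, 1 mismatch.
print(percent_difference("AC-GT", "ACTGA"))
```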

Figure 9 shows UPGMA ("unweighted pair group
method with arithmetic mean") trees of HC-J4/91 (Okamoto
et al., 1992b), HC-J4/83 (Okamoto et al., 1992b), two
prototype strains of genotype 1b (HCV-J, Kato et al.,
1990; HCV-BK, Takamizawa et al., 1991), and L clones (this
study).

Figure 10 shows the alignment of the HVR1 and
HVR2 amino acid sequences of the E2 sequences of nine L
clones of HC-J4 (species A, B, and C) obtained from an
early acute phase plasma pool of an experimentally
infected chimpanzee compared with the sequences of eight
clones (HC-J4/91-20 through HC-J4/91-27, Okamoto et al.,
1992b) derived from the inoculum. Dot: an amino acid
identical to that in the top line. Capital letters: amino
acid different from that in the top line.

Figure 11 shows the alignment of the 5' UTR and
the 3' UTR sequences of infectious clones of genotype 1a
(pCV-H77C) and 1b (pCV-J4L6S). Top line: consensus
sequence of the indicated strain. Dot: identity with
consensus sequence. Capital letter: different from the
consensus sequence. Dash: deletion. Underlined: PinAI
and BfrI cleavage site. Numbering corresponds to the HCV
sequence of pCV-J4L6S.

Figure 12 shows a comparison of individual full-length
cDNA clones of the ORF of HCV strain HC-J4 with
the consensus sequence (see Fig. 7). Solid lines: amino
acid changes. Dashed lines: silent mutations. Clone
pCV-J4L6S was infectious in vivo whereas clones pCV-J4L2S and
pCV-J4L4S were not.

Figure 13 shows biochemical (ALT levels) and PCR
analyses of a chimpanzee following percutaneous
intrahepatic transfection with RNA transcripts of the
infectious clone of pCV-J4L2S, pCV-J4L4S and pCV-J4L6S.
The ALT serum enzyme levels were measured in units per
liter (u/l). For the PCR analysis, "HCV RNA" represented
by an open rectangle indicates a serum sample that was
negative for HCV after nested PCR; "HCV RNA" represented
by a closed rectangle indicates that the serum sample was
positive for HCV; and HCV GE titer on the right-hand y-axis
represents genome equivalents.

Figures 14A-14F show the nucleotide sequence of
the infectious clone of genotype 1b strain HC-J4 and
Figures 14G-14H show the amino acid sequence encoded by
the HC-J4 clone.

Figure 15 shows the strategy for constructing a
chimeric HCV clone designated pH77CV-J4 which contains the
nonstructural region of the infectious clone of genotype
1a strain H77 and the structural region of the infectious
clone of genotype 1b strain HC-J4.

Figures 16A-16F show the nucleotide sequence of
the chimeric 1a/1b clone pH77CV-J4 of Figure 15 and
Figures 16G-16H show the amino acid sequence encoded by
the chimeric 1a/1b clone.

Figures 17A and 17B show the sequence of the 3'
untranslated region remaining in various 3' deletion
mutants of the 1a infectious clone pCV-H77C and the
strategy utilized in constructing each 3' deletion mutant
(Figures 17C-17G).

Of the seven deletion mutants shown, two
(pCV-H77C(-98X) and (-42X)) have been constructed and tested
for infectivity in chimpanzees (see Figures 17A and 17C)
and the other six are to be constructed and tested for
infectivity as described in Figures 17D-17G.

Figures 18A and 18B show biochemical (ALT
levels), PCR (HCV RNA and HCV GE titer), serological
(anti-HCV) and histopathological (Fig. 18B only) analyses
of chimpanzees 1494 (Fig. 18A) and 1530 (Fig. 18B)
following transfection with the infectious cDNA clone
pCV-H77C.

The ALT serum enzyme levels were measured in
units per liter (u/l). For the PCR analysis, "HCV RNA"
represented by an open rectangle indicates a serum sample
that was negative for HCV after nested PCR; "HCV RNA"
represented by a closed rectangle indicates that the serum
sample was positive for HCV; and HCV GE titer on the
right-hand y-axis represents genome equivalents.

The bar marked "anti-HCV" indicates samples that
were positive for anti-HCV antibodies as determined by
commercial assays. The histopathology scores in Figure
18B correspond to no histopathology (○), mild hepatitis
(◐) and moderate to severe hepatitis (●).

DESCRIPTION OF THE INVENTION 

The present invention relates to nucleic acid
sequences which comprise the genome of an infectious
hepatitis C virus. More specifically, the invention
relates to nucleic acid sequences which encode infectious
hepatitis C viruses of genotype 1a and 1b strains. In one
embodiment, the infectious nucleic acid sequence of the
invention has the sequence shown in Figures 4A-4F of this
application. In another embodiment, the infectious
nucleic acid sequence has the sequence shown in Figures
14A-14F and is contained in a plasmid construct deposited
with the American Type Culture Collection (ATCC) on
January 26, 1998 and having ATCC accession number 209596.

The invention also relates to "chimeric nucleic
acid sequences" where the chimeric nucleic acid sequences
consist of open-reading frame sequences taken from
infectious nucleic acid sequences of hepatitis C viruses
of different genotypes or subtypes.

In one embodiment, the chimeric nucleic acid
sequence consists of sequence from the genome of an HCV
strain belonging to one genotype or subtype which encodes
structural polypeptides and sequence of an HCV strain
belonging to another genotype strain or subtype which
encodes nonstructural polypeptides. Such chimeras can be
produced by standard techniques of restriction digestion,
PCR amplification and subcloning known to those of
ordinary skill in the art.

In a preferred embodiment, the sequence encoding
nonstructural polypeptides is from an infectious nucleic
acid sequence encoding a genotype 1a strain where the
construction of a chimeric 1a/1b nucleic acid sequence is
described in Example 9 and the chimeric 1a/1b nucleic acid
sequence is shown in Figures 16A-16F. It is believed that
the construction of such chimeric nucleic acid sequences
will be of importance in studying the growth and virulence
properties of hepatitis C virus and in the production of
hepatitis C viruses suitable to confer protection against
multiple genotypes of HCV. For example, one might produce
a "multivalent" vaccine by putting epitopes from several
genotypes or subtypes into one clone. Alternatively one
might replace just a single gene from an infectious
sequence with the corresponding gene from the genomic
sequence of a strain from another genotype or subtype or
create a chimeric gene which contains portions of a gene
from two genotypes or subtypes. Examples of genes which
could be replaced or which could be made chimeric,
include, but are not limited to, the E1, E2 and NS4 genes.
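
At the sequence level, a chimera of the kind described above (structural
region from one strain, nonstructural region from another) amounts to
joining two aligned open reading frames at a shared, in-frame junction.
The sketch below illustrates only that bookkeeping: the function name,
the junction coordinate and the toy sequences are assumptions, and the
sketch does not represent the cloning strategy of Example 9 or Figure 15.

```python
def make_chimera(structural_source: str, nonstructural_source: str,
                 junction: int) -> str:
    """Join two aligned ORFs at a 0-based nucleotide junction.

    Nucleotides before `junction` come from `structural_source`; the
    remainder comes from `nonstructural_source`. The junction must lie on
    a codon boundary so that the reading frame is preserved.
    """
    if len(structural_source) != len(nonstructural_source):
        raise ValueError("this sketch assumes aligned ORFs of equal length")
    if junction % 3 != 0:
        raise ValueError("junction must fall on a codon boundary")
    if not 0 < junction < len(structural_source):
        raise ValueError("junction must lie inside the ORF")
    return structural_source[:junction] + nonstructural_source[junction:]


# Hypothetical toy ORFs standing in for a 1b and a 1a strain.
orf_1b = "ATGTCTACAAACCCA"
orf_1a = "ATGAGCACGAATCCT"

# First three codons from the 1b toy ORF, the rest from the 1a toy ORF.
chimera = make_chimera(orf_1b, orf_1a, junction=9)
print(chimera)
```

Real genomes of different strains need not be the same length, so an
actual design would define the junction relative to an alignment or to a
shared restriction site rather than a single fixed index.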

The invention further relates to mutations of
the infectious nucleic acid sequences where "mutations"
includes, but is not limited to, point mutations,
deletions and insertions. Of course, one of ordinary
skill in the art would recognize that the size of the
insertions would be limited by the ability of the
resultant nucleic acid sequence to be properly packaged
within the virion. Such mutations could be produced by
techniques known to those of skill in the art such as
site-directed mutagenesis, fusion PCR, and restriction
digestion followed by religation.

In one embodiment, mutagenesis might be
undertaken to determine sequences that are important for
viral properties such as replication or virulence. For
example, one may introduce a mutation into the infectious
nucleic acid sequence which eliminates the cleavage site
between the NS4A and NS4B polypeptides to examine the
effects on viral replication and processing of the
polypeptide. Alternatively, one or more of the 3 amino
acids encoded by the infectious 1b nucleic acid sequence
shown in Figures 14A-14F which differ from the HC-J4
consensus sequence may be back mutated to the
corresponding amino acid in the HC-J4 consensus sequence
to determine the importance of these three amino acid
changes to infectivity or virulence. In yet another
embodiment, one or more of the amino acids from the
noninfectious 1b clones pCV-J4L2S and pCV-J4L4S which
differ from the consensus sequence may be introduced into
the infectious 1b sequence shown in Figures 14A-14F.

In yet another example, one may delete all or
part of a gene or of the 5' or 3' nontranslated region
contained in an infectious nucleic acid sequence and then
transfect a host cell (animal or cell culture) with the
mutated sequence and measure viral replication in the host
by methods known in the art such as RT-PCR. Preferred
genes include, but are not limited to, the P7, NS4B and
NS5A genes. Of course, those of ordinary skill in the art
will understand that deletion of part of a gene,
preferably the central portion of the gene, may be
preferable to deletion of the entire gene in order to
conserve the cleavage site boundaries which exist between
proteins in the HCV polyprotein and which are necessary
for proper processing of the polyprotein.
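
The reasoning above, that a central deletion preserves the flanking
cleavage site boundaries, can be illustrated with a short Python
sketch. It assumes an in-frame coding sequence and removes whole
codons from the middle so that the reading frame and both ends are
retained. The function name, the number of flanking codons kept and
the toy sequence are hypothetical.

```python
def delete_central_codons(gene: str, keep_codons: int) -> str:
    """Remove the middle of an in-frame gene, keeping `keep_codons` codons per end."""
    if len(gene) % 3 != 0:
        raise ValueError("gene length must be a multiple of 3")
    n_codons = len(gene) // 3
    if keep_codons < 0 or 2 * keep_codons > n_codons:
        raise ValueError("cannot keep more codons than the gene contains")
    flank = 3 * keep_codons
    # Keep the 5' and 3' flanks; drop everything between them.
    return gene[:flank] + gene[len(gene) - flank:]


# Toy 8-codon sequence (hypothetical): ATG GCT AAA GGG CCC TTT GAA CGT
toy_gene = "ATGGCTAAAGGGCCCTTTGAACGT"
deleted = delete_central_codons(toy_gene, keep_codons=2)
```

Because only whole codons are removed, the downstream polyprotein
stays in frame; how many flanking codons a given cleavage site
requires would have to be determined experimentally.
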

In the alternative, if the transfection is into
a host animal such as a chimpanzee, one can monitor the
virulence phenotype of the virus produced by transfection
of the mutated infectious nucleic acid sequence by methods
known in the art such as measurement of liver enzyme
levels (alanine aminotransferase (ALT) or isocitrate
dehydrogenase (ICD)) or by histopathology of liver
biopsies. Thus, mutations of the infectious nucleic acid
sequences may be useful in the production of attenuated
HCV strains suitable for vaccine use.

The invention also relates to the use of the
infectious nucleic acid sequences of the present invention
to produce attenuated viral strains via passage in vitro
or in vivo of the virus produced by transfection with the
infectious nucleic acid sequences.

The present invention therefore relates to the
use of the nucleic acid sequences of the invention to
identify cell lines capable of supporting the replication
of HCV.

In particular, it is contemplated that the 
mutations of the infectious nucleic acid sequences of the 
invention and the production of chimeric sequences as
discussed above may be useful in identifying sequences 
critical for cell culture adaptation of HCV and hence, may 
be useful in identifying cell lines capable of supporting 
HCV replication. 

Transfection of tissue culture cells with the
nucleic acid sequences of the invention may be done by
methods of transfection known in the art such as
electroporation, precipitation with DEAE-dextran or
calcium phosphate, or liposomes.

In one such embodiment, the method comprises the
growing of animal cells, especially human cells, in vitro
and transfecting the cells with the nucleic acid of the
invention, then determining if the cells show indicia of
HCV infection. Such indicia include the detection of
viral antigens in the cell, for example, by
immunofluorescent procedures well known in the art; the
detection of viral polypeptides by Western blotting using
antibodies specific therefor; and the detection of newly
transcribed viral RNA within the cells via methods such as
RT-PCR. The presence of live, infectious virus particles
following such tests may also be shown by injection of
cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection.

Suitable cells or cell lines for culturing HCV
include, but are not limited to, lymphocyte and hepatocyte
cell lines known in the art.

Alternatively, primary hepatocytes can be
cultured, and then infected with HCV; or, the hepatocyte
cultures could be derived from the livers of infected
chimpanzees. In addition, various immortalization methods
known to those of ordinary skill in the art can be used to
obtain cell lines derived from hepatocyte cultures. For
example, primary hepatocyte cultures may be fused to a
variety of cells to maintain stability.

The present invention further relates to the in 
vitro and in vivo production of hepatitis C viruses from 
the nucleic acid sequences of the invention. 

In one embodiment, the sequences of the
invention can be inserted into an expression vector that
functions in eukaryotic cells. Eukaryotic expression
vectors suitable for producing high efficiency gene
transfer in vivo are well known to those of ordinary skill
in the art and include, but are not limited to, plasmids,
vaccinia viruses, retroviruses, adenoviruses and
adeno-associated viruses.

In another embodiment, the sequences contained
in the recombinant expression vector can be transcribed in
vitro by methods known to those of ordinary skill in the
art in order to produce RNA transcripts which encode the
hepatitis C viruses of the invention. The hepatitis C
viruses of the invention may then be produced by
transfecting cells by methods known to those of ordinary
skill in the art with either the in vitro transcription
mixture containing the RNA transcripts (see Example 4) or
with the recombinant expression vectors containing the
nucleic acid sequences described herein.

The present invention also relates to the 
construction of cassette vectors useful in the cloning of 
viral genomes wherein said vectors comprise a nucleic acid
sequence to be cloned, and said vector reading in the
correct phase for the expression of the viral nucleic acid 
to be cloned. Such a cassette vector will, of course, 
also possess a promoter sequence, advantageously placed 
upstream of the sequence to be expressed. Cassette 
vectors according to the present invention are constructed 
according to the procedure described in Figure 1, for 
example, starting with plasmid pCV. Of course, the DNA to 
be inserted into said cassette vector can be derived from 
any virus, advantageously from HCV, and most 
advantageously from the H77 strain of HCV. The nucleic 
acid to be inserted according to the present invention 
can, for example, contain one or more open reading frames 
of the virus, for example, HCV. The cassette vectors of 
the present invention may also contain, optionally, one or 
more expressible marker genes for expression as an 
indication of successful transfection and expression of 
the nucleic acid sequences of the vector. To ensure
expression, the cassette vectors of the present invention
will contain a promoter sequence for binding of the 
appropriate cellular RNA polymerase, which will depend on 
the cell into which the vector has been introduced. For 
example, if the host cell is a bacterial cell, then said 
promoter will be a bacterial promoter sequence to which 
the bacterial RNA polymerases will bind. 

The hepatitis C viruses produced from the 
sequences of the invention may be purified or partially
purified from the transfected cells by methods known to 
those of ordinary skill in the art. In a preferred 
embodiment, the viruses are partially purified prior to
their use as immunogens in the pharmaceutical compositions
and vaccines of the present invention. 

The present invention therefore relates to the
use of the hepatitis C viruses produced from the nucleic
acid sequences of the invention as immunogens in live or
killed (e.g., formalin inactivated) vaccines to prevent
hepatitis C in a mammal.

In an alternative embodiment, the immunogen of 
the present invention may be an infectious nucleic acid
sequence, a chimeric nucleic acid sequence, or a mutated 
infectious nucleic acid sequence which encodes a hepatitis 
C virus. Where the sequence is a cDNA sequence, the cDNAs 
and their RNA transcripts may be used to transfect a 
mammal by direct injection into the liver tissue of the 
mammal as described in the Examples. 

Alternatively, direct gene transfer may be 
accomplished via administration of a eukaryotic expression 
vector containing a nucleic acid sequence of the
invention. 

In yet another embodiment, the immunogen may be
a polypeptide encoded by the nucleic acid sequences of the
invention. The present invention therefore also relates
to polypeptides produced from the nucleic acid sequences
of the invention or fragments thereof. In one embodiment,
polypeptides of the present invention can be recombinantly
produced by synthesis from the nucleic acid sequences of
the invention or isolated fragments thereof, and purified,
or partially purified, from transfected cells using
methods already known in the art. In an alternative
embodiment, the polypeptides may be purified or partially
purified from viral particles produced via transfection of
a host cell with the nucleic acid sequences of the
invention. Such polypeptides might, for example, include
either capsid or envelope polypeptides prepared from the
sequences of the present invention.

When used as immunogens, the nucleic acid
sequences of the invention, or the polypeptides or viruses
produced therefrom, are preferably partially purified
prior to use as immunogens in pharmaceutical compositions
and vaccines of the present invention. When used as a
vaccine, the sequences and the polypeptide and virus
products thereof can be administered alone or in a
suitable diluent, including, but not limited to, water,
saline, or some type of buffered medium. The vaccine
according to the present invention may be administered to
an animal, especially a mammal, and most especially a
human, by a variety of routes, including, but not limited
to, intradermally, intramuscularly, subcutaneously, or in
any combination thereof.

Suitable amounts of material to administer for
prophylactic and therapeutic purposes will vary depending
on the route selected and the immunogen (nucleic acid,
virus, polypeptide) administered. One skilled in the art
will appreciate that the amounts to be administered for
any particular treatment protocol can be readily
determined without undue experimentation. The vaccines of
the present invention may be administered once or
periodically until a suitable titer of anti-HCV antibodies
appears in the blood. For an immunogen consisting of a
nucleic acid sequence, a suitable amount of nucleic acid
sequence to be used for prophylactic purposes might be
expected to fall in the range of from about 100 µg to
about 5 mg and most preferably in the range of from about
500 µg to about 2 mg. For a polypeptide, a suitable amount
to use for prophylactic purposes is preferably 100 ng to
100 µg and for a virus 10^2 to 10^6 infectious doses. Such
administration will, of course, occur prior to any sign of
HCV infection.

A vaccine of the present invention may be
employed in such forms as capsules, liquid solutions,
suspensions or elixirs for oral administration, or sterile
liquid forms such as solutions or suspensions. Any inert
carrier is preferably used, such as saline or
phosphate-buffered saline, or any such carrier in which the
HCV of the present invention can be suitably suspended. The
vaccines may be in the form of single dose preparations or
in multi-dose flasks which can be utilized for
mass-vaccination programs of both animals and humans. For
purposes of using the vaccines of the present invention
reference is made to Remington's Pharmaceutical Sciences,
Mack Publishing Co., Easton, Pa., Osol (Ed.) (1980); and
New Trends and Developments in Vaccines, Voller et al.
(Eds.), University Park Press, Baltimore, Md. (1978), both
of which provide much useful information for preparing and
using vaccines. Of course, the polypeptides of the
present invention, when used as vaccines, can include, as
part of the composition or emulsion, a suitable adjuvant,
such as alum (or aluminum hydroxide) when humans are to be
vaccinated, to further stimulate production of antibodies
by immune cells. When nucleic acids or viruses are used
for vaccination purposes, other specific adjuvants such as
CpG motifs (Krieg, A. K. et al. (1995) and (1996)), may
prove useful.

When the nucleic acids, viruses and polypeptides 
of the present invention are used as vaccines or inocula, 
they will normally exist as physically discrete units 
suitable as a unitary dosage for animals, especially 
mammals, and most especially humans, wherein each unit 
will contain a predetermined quantity of active material 
calculated to produce the desired immunogenic effect in 
association with the required diluent. The dose of said 
vaccine or inoculum according to the present invention is 
administered at least once. In order to increase the 
antibody level, a second or booster dose may be 
administered at some time after the initial dose. The 
need for, and timing of, such booster dose will, of 
course, be determined within the sound judgment of the 
administrator of such vaccine or inoculum and according to
sound principles well known in the art. For example, such
booster dose could reasonably be expected to be
advantageous at some time between about 2 weeks and about 6
months following the initial vaccination. Subsequent
doses may be administered as indicated. 

The nucleic acid sequences, viruses and
polypeptides of the present invention can also be
administered for purposes of therapy, where a mammal,
especially a primate, and most especially a human, is
already infected, as shown by well known diagnostic
measures. When the nucleic acid sequences, viruses or
polypeptides of the present invention are used for such
therapeutic purposes, much of the same criteria will apply
as when they are used as a vaccine, except that inoculation
will occur post-infection. Thus, when the nucleic acid
sequences, viruses or polypeptides of the present
invention are used as therapeutic agents in the treatment
of infection, the therapeutic agent comprises a
pharmaceutical composition containing a sufficient amount
of said nucleic acid sequences, viruses or polypeptides so
as to elicit a therapeutically effective response in the
organism to be treated. Of course, the amount of
pharmaceutical composition to be administered will, as for
vaccines, vary depending on the immunogen contained
therein (nucleic acid, polypeptide, virus) and on the
route of administration.

The therapeutic agent according to the present
invention can thus be administered by subcutaneous,
intramuscular or intradermal routes. One skilled in the
art will certainly appreciate that the amounts to be
administered for any particular treatment protocol can be
readily determined without undue experimentation. Of
course, the actual amounts will vary depending on the
route of administration as well as the sex, age, and
clinical status of the subject which, in the case of human
patients, is to be determined with the sound judgment of
the clinician.

The therapeutic agent of the present invention
can be employed in such forms as capsules, liquid
solutions, suspensions or elixirs, or sterile liquid forms
such as solutions or suspensions. Any inert carrier is
preferably used, such as saline, phosphate-buffered
saline, or any such carrier in which the HCV of the
present invention can be suitably suspended. The
therapeutic agents may be in the form of single dose
preparations or in multi-dose flasks which can be
utilized for mass-treatment programs of both animals and
humans. Of course, when the nucleic acid sequences,
viruses or polypeptides of the present invention are used
as therapeutic agents they may be administered as a single
dose or as a series of doses, depending on the situation
as determined by the person conducting the treatment.

The nucleic acids, polypeptides and viruses of
the present invention can also be utilized in the
production of antibodies against HCV. The term "antibody"
is herein used to refer to immunoglobulin molecules and
immunologically active portions of immunoglobulin
molecules. Examples of antibody molecules are intact
immunoglobulin molecules, substantially intact
immunoglobulin molecules and portions of an immunoglobulin
molecule, including those portions known in the art as
Fab, F(ab')2 and F(v) as well as chimeric antibody
molecules.

Thus, the polypeptides, viruses and nucleic acid
sequences of the present invention can be used in the
generation of antibodies that immunoreact (i.e., specific
binding between an antigenic determinant-containing
molecule and a molecule containing an antibody combining
site such as a whole antibody molecule or an active
portion thereof) with antigenic determinants on the
surface of hepatitis C virus particles.

The present invention therefore also relates to
antibodies produced following immunization with the
nucleic acid sequences, viruses or polypeptides of the
present invention. These antibodies are typically
produced by immunizing a mammal with an immunogen or
vaccine to induce antibody molecules having
immunospecificity for polypeptides or viruses produced in
response to infection with the nucleic acid sequences of
the present invention. When used in generating such
antibodies, the nucleic acid sequences, viruses, or
polypeptides of the present invention may be linked to
some type of carrier molecule. The resulting antibody
molecules are then collected from said mammal. Antibodies
produced according to the present invention have the
unique advantage of being generated in response to
authentic, functional polypeptides produced according to
the actual cloned HCV genome.

The antibody molecules of the present invention
may be polyclonal or monoclonal. Monoclonal antibodies
are readily produced by methods well known in the art.
Portions of immunoglobulin molecules, such as Fabs, as well
as chimeric antibodies, may also be produced by methods
well known to those of ordinary skill in the art of
generating such antibodies.

The antibodies according to the present
invention may also be contained in blood plasma, serum,
hybridoma supernatants, and the like. Alternatively, the
antibody of the present invention is isolated to the
extent desired by well known techniques such as, for
example, using DEAE Sephadex. The antibodies produced
according to the present invention may be further purified
so as to obtain specific classes or subclasses of antibody
such as IgM, IgG, IgA, and the like. Antibodies of the
IgG class are preferred for purposes of passive
protection.

The antibodies of the present invention are
useful in the prevention and treatment of diseases caused
by hepatitis C virus in animals, especially mammals, and
most especially humans.

In providing the antibodies of the present 
invention to a recipient mammal, preferably a human, the 
dosage of administered antibodies will vary depending on 
such factors as the mammal's age, weight, height, sex, 
general medical condition, previous medical history, and 
the like. 

In general, it will be advantageous to provide 
the recipient mammal with a dosage of antibodies in the 
range of from about 1 mg/kg body weight to about 10 mg/kg
body weight of the mammal, although a lower or higher dose 
may be administered if found desirable. Such antibodies 
will normally be administered by intravenous or 
intramuscular route as an inoculum. The antibodies of the 
present invention are intended to be provided to the 
recipient subject in an amount sufficient to prevent, 
lessen or attenuate the severity, extent or duration of 
any existing infection. 

The antibodies prepared by use of the nucleic
acid sequences, viruses or polypeptides of the present
invention are also highly useful for diagnostic purposes.
For example, the antibodies can be used as in vitro
diagnostic agents to test for the presence of HCV in
biological samples taken from animals, especially humans.
Such assays include, but are not limited to,
radioimmunoassays, EIA, fluorescence, Western blot
analysis and ELISAs. In one such embodiment, the
biological sample is contacted with antibodies of the
present invention and a labeled second antibody is used to
detect the presence of HCV to which the antibodies are
bound.

Such assays may be, for example, a direct
protocol (where the labeled first antibody is
immunoreactive with the antigen, such as, for example, a
polypeptide on the surface of the virus), an indirect
protocol (where a labeled second antibody is reactive with
the first antibody), a competitive protocol (such as would
involve the addition of a labeled antigen), or a sandwich
protocol (where both labeled and unlabeled antibody are
used), as well as other protocols well known and described
in the art.

In one embodiment, an immunoassay method would
utilize an antibody specific for HCV envelope determinants
and would further comprise the steps of contacting a
biological sample with the HCV-specific antibody and then
detecting the presence of HCV material in the test sample
using one of the types of assay protocols as described
above. Polypeptides and antibodies produced according to
the present invention may also be supplied in the form of
a kit, either present in vials as purified material, or
present in compositions and suspended in suitable diluents
as previously described.

In a preferred embodiment, such a diagnostic
test kit for detection of HCV antigens in a test sample
comprises in combination a series of containers, each
container holding a reagent needed for such assay. Thus,
one such container would contain a specific amount of
HCV-specific antibody as already described, a second
container would contain a diluent for suspension of the
sample to be tested, a third container would contain a
positive control and an additional container would contain
a negative control. An additional container could contain
a blank.

For all prophylactic, therapeutic and diagnostic 
uses, the antibodies of the invention and other reagents, 
plus appropriate devices and accessories, may be provided 
in the form of a kit so as to facilitate ready 
availability and ease of use. 

The present invention also relates to the use of
nucleic acid sequences and polypeptides of the present
invention to screen potential antiviral agents for
antiviral activity against HCV. Such screening methods
are known by those of skill in the art. Generally, the
antiviral agents are tested at a variety of
concentrations for their effect on preventing viral
replication in cell culture systems which support viral
replication, and then for an inhibition of infectivity or
of viral pathogenicity (and a low level of toxicity) in an
animal model system.

In one embodiment, animal cells (especially
human cells) transfected with the nucleic acid sequences
of the invention are cultured in vitro and the cells are
treated with a candidate antiviral agent (a chemical,
peptide etc.) for antiviral activity by adding the
candidate agent to the medium. The treated cells are then
exposed, possibly under transfecting or fusing conditions
known in the art, to the nucleic acid sequences of the
present invention. A sufficient period of time would then
be allowed to pass for infection to occur, following which
the presence or absence of viral replication would be
determined versus untreated control cells by methods known
to those of ordinary skill in the art. Such methods
include, but are not limited to, the detection of viral
antigens in the cell, for example, by immunofluorescent
procedures well known in the art; the detection of viral
polypeptides by Western blotting using antibodies specific
therefor; the detection of newly transcribed viral RNA
within the cells by RT-PCR; and the detection of the
presence of live, infectious virus particles by injection
of cell culture medium or cell lysates into healthy,
susceptible animals, with subsequent exhibition of the
symptoms of HCV infection. A comparison of results
obtained for control cells (treated only with nucleic acid
sequence) with those obtained for treated cells (nucleic
acid sequence and antiviral agent) would indicate the
degree, if any, of antiviral activity of the candidate
antiviral agent. Of course, one of ordinary skill in the
art would readily understand that such cells can be
treated with the candidate antiviral agent either before
or after exposure to the nucleic acid sequence of the
present invention so as to determine what stage, or
stages, of viral infection and replication said agent is
effective against.

In an alternative embodiment, a protease such as
NS3 protease produced from a nucleic acid sequence of the
invention may be used to screen for protease inhibitors
which may act as antiviral agents. The structural and
nonstructural regions of the HCV genome, including
nucleotide and amino acid locations, have been determined,
for example, as depicted in Houghton, M. (1996), Fig. 1;
and Major, M.E. et al. (1997), Table 1.

Such above-mentioned protease inhibitors may
take the form of chemical compounds or peptides which
mimic the known cleavage sites of the protease and may be
screened using methods known to those of skill in the art
(Houghton, M. (1996) and Major, M.E. et al. (1997)). For
example, a substrate may be employed which mimics the
protease's natural substrate, but which provides a
detectable signal (e.g. by fluorimetric or colorimetric
methods) when cleaved. This substrate is then incubated
with the protease and the candidate protease inhibitor
under conditions of suitable pH, temperature etc. to
detect protease activity. The proteolytic activities of
the protease in the presence or absence of the candidate
inhibitor are then determined.
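
The comparison of proteolytic activity with and without a candidate
inhibitor can be expressed as a percent inhibition. The following
Python sketch shows one common way to compute it; the function name,
the use of a signal rate as the activity measure and the numeric
values are assumptions made for illustration and are not data from
this application.

```python
def percent_inhibition(rate_with_inhibitor: float, rate_without_inhibitor: float) -> float:
    """Percent reduction in cleavage rate relative to the uninhibited control."""
    if rate_without_inhibitor <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - rate_with_inhibitor / rate_without_inhibitor)


# Hypothetical signal rates (arbitrary units per minute).
control_rate = 100.0   # protease + substrate, no inhibitor
treated_rate = 25.0    # protease + substrate + candidate inhibitor
inhibition = percent_inhibition(treated_rate, control_rate)
print(f"{inhibition:.1f}% inhibition")
```

Repeating the calculation over a range of inhibitor concentrations
would give a dose-response curve for the candidate compound.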

In yet another embodiment, a candidate antiviral
agent (such as a protease inhibitor) may be directly
assayed in vivo for antiviral activity by administering
the candidate antiviral agent to a chimpanzee transfected
with a nucleic acid sequence of the invention and then
measuring viral replication in vivo via methods such as
RT-PCR. Of course, the chimpanzee may be treated with the
candidate agent either before or after transfection with
the infectious nucleic acid sequence so as to determine
what stage, or stages, of viral infection and replication
the agent is effective against.

The invention also provides that the nucleic 
acid sequences, viruses and polypeptides of the invention
may be supplied in the form of a kit, alone or in the form 
of a pharmaceutical composition. 

All scientific publications and/or patents cited
herein are specifically incorporated by reference. The
following examples illustrate various aspects of the
invention but are in no way intended to limit the scope
thereof.

EXAMPLES

MATERIALS AND METHODS
For Examples 1-4

Collection of Virus

Hepatitis C virus was collected and used as a
source for the RNA used in generating the cDNA clones
according to the present invention. Plasma containing
strain H77 of HCV was obtained from a patient in the acute
phase of transfusion-associated non-A, non-B hepatitis
(Feinstone et al (1981)). Strain H77 belongs to genotype
1a of HCV (Ogata et al (1991), Inchauspe et al (1991)).
The consensus sequence for most of its genome has been
determined (Kolyakov et al (1996), Ogata et al (1991),
Inchauspe et al (1991) and Farci et al (1996)).

RNA Purification

Viral RNA was collected and purified by
conventional means. In general, total RNA from 10 µl of
H77 plasma was extracted with the TRIzol system (GIBCO
BRL). The RNA pellet was resuspended in 100 µl of 10 mM
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40
units/µl) (available from Promega) and 10 µl aliquots were
stored at -80°C. In subsequent experiments RT-PCR was
performed on RNA equivalent to 1 µl of H77 plasma, which
contained an estimated 10^5 genome equivalents (GE) of HCV
(Yanagi et al (1996)).
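
The relationship between the extraction volumes and the amount of
RNA used per RT-PCR can be checked with simple arithmetic: RNA from
10 µl of plasma resuspended in 100 µl means that each 10 µl aliquot
corresponds to 1 µl of plasma. A minimal Python sketch follows; the
function name is hypothetical and the calculation assumes no loss of
RNA during extraction.

```python
def rna_volume_for_plasma_equivalent(
    plasma_extracted_ul: float,
    resuspension_ul: float,
    plasma_equivalent_ul: float,
) -> float:
    """Volume of resuspended RNA (µl) corresponding to a given plasma volume (µl)."""
    if plasma_extracted_ul <= 0:
        raise ValueError("extracted plasma volume must be positive")
    # Assumes the RNA is evenly distributed and fully recovered.
    return plasma_equivalent_ul * resuspension_ul / plasma_extracted_ul


# Values from the RNA purification step: 10 µl plasma, 100 µl resuspension.
rna_per_reaction_ul = rna_volume_for_plasma_equivalent(10.0, 100.0, 1.0)
print(f"{rna_per_reaction_ul:.0f} µl of RNA solution per 1 µl plasma equivalent")
```

Under these assumptions one stored aliquot supplies the template
for one RT-PCR.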

Primers used in the RT-PCR process were deduced
from the genomic sequences of strain H77 according to
procedures already known in the art (see above) or else
were determined specifically for use herein. The primers
generated for this purpose are listed in Table 1.

Table 1. Oligonucleotides used for PCR
amplification of strain H77 of HCV

Designation   Sequence (5' -> 3')*

H9261F        GGCTACAGCGGGGGGAGACATTTATCACAGC
H3'X58R       TCATGCGGCTCACGGACCTTTCACAGCTAG
H9282F        GTCCAAGCTTATCACAGCGTGTCTGATGCCCGGCCCCG
H3'X45R       CGTCTCTAGAGGACCTTTCACAGCTAGCCGTGACTAGGG
H9375F        TGAAGGTTGGGGTAAACACTCCGGCCTCTTAGGCCATT
H3'X-35R      ACATGATCTGCAGAGAGGCCAGTATCAGCACTCTC
H9386F        GTCCAAGCTTACGCGTAAACACTCCGGCCTCCTTAAGCCATTCCTG
H3'X-38R      CGTCTCTAGACATGATCTGCAGAGAGGCCAGTATCAGCACTCTCTGC
H1            TTTTTTTTGCGGCCGCTAATACGACTCACTATAGCCAGCCCCCTGATGGGGGCGACACTCCACCATG
A1            ACTGTCTTCACGCAGAAAGCGTCTAGCCAT
H9417R        CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC

* HCV sequences are shown in plain text, non-HCV-specific
sequences are shown in boldface and artificial cleavage sites
used for cDNA cloning are underlined. The core sequence of the
T7 promoter in primer H1 is shown in italics.

Primers for long RT-PCR were size-purified. 
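
The artificial cleavage sites in the primers of Table 1 can be
located by searching for the recognition sequences of the
restriction enzymes named in the cloning steps (Hind III, Xba I,
Not I and Afl II). The Python sketch below does this for three of
the primers, with the sequences entered without internal spacing.
The function and variable names are hypothetical; the recognition
sequences are the standard ones for these enzymes.

```python
# Recognition sequences (5' -> 3') of the enzymes named in the text.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "NotI": "GCGGCCGC",
    "AflII": "CTTAAG",
}


def find_sites(sequence: str) -> dict[str, list[int]]:
    """Return 0-based start positions of each recognition site found in `sequence`."""
    hits: dict[str, list[int]] = {}
    for enzyme, site in RECOGNITION_SITES.items():
        start = sequence.find(site)
        while start != -1:
            hits.setdefault(enzyme, []).append(start)
            start = sequence.find(site, start + 1)
    return hits


# Primer sequences as listed in Table 1 (spacing removed).
primers = {
    "H9386F": "GTCCAAGCTTACGCGTAAACACTCCGGCCTCCTTAAGCCATTCCTG",
    "H3'X-38R": "CGTCTCTAGACATGATCTGCAGAGAGGCCAGTATCAGCACTCTCTGC",
    "H9417R": "CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC",
}
sites_by_primer = {name: find_sites(seq) for name, seq in primers.items()}
for name, sites in sites_by_primer.items():
    print(name, sites)
```

The search covers only the strand as written, which is sufficient
here because all four recognition sequences are palindromic.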

cDNA Synthesis 

The RNA was denatured at 65°C for 2 min, and
cDNA synthesis was performed in a 20 µl reaction volume
with Superscript II reverse transcriptase (from GIBCO/BRL)
at 42°C for 1 hour using specific antisense primers as
described previously (Tellier et al (1996)). The cDNA
mixture was treated with RNase H and RNase T1 (GIBCO/BRL)
for 20 min at 37°C.

Amplification and Cloning of the 3' UTR 

The 3' UTR of strain H77 was amplified by PCR
in two different assays. In both of these nested PCR
reactions the first round of PCR was performed in a total
volume of 50 µl in 1 x buffer, 250 µmol of each
deoxynucleoside triphosphate (dNTP; Pharmacia), 20 pmol
each of external sense and antisense primers, 1 µl of the
Advantage KlenTaq polymerase mix (from Clontech) and 2 µl
of the final cDNA reaction mixture. In the second round
of PCR, 5 µl of the first round PCR mixture was added to
45 µl of PCR mixture prepared as already described. Each
round of PCR (35 cycles), which was performed in a Perkin
Elmer DNA thermal cycler 480, consisted of denaturation at
94°C for 1 min (in 1st cycle 1 min 30 sec), annealing at
60°C for 1 min and elongation at 68°C for 2 min. In one
experiment a region from NS5B to the conserved region of
the 3' UTR was amplified with the external primers H9261F
and H3'X58R, and the internal primers H9282F and H3'X45R
(Table 1). In another experiment, a segment of the
variable region to the very end of the 3' UTR was
amplified with the external primers H9375F and H3'X-35R,
and the internal primers H9386F and H3'X-38R (Table 1,
Fig. 1). Amplified products were purified with QIAquick
PCR purification kit (from QIAGEN), digested with Hind III
and Xba I (from Promega), purified by either gel
electrophoresis or phenol/chloroform extraction, and then
cloned into the multiple cloning site of plasmid
pGEM-9zf(-) (Promega) or pUC19 (Pharmacia). Cloning of cDNA
into the vector was performed with T4 DNA ligase (Promega)
by standard procedures.

Amplification of Near Full-Length H77 Genomes by Long PCR

The reactions were performed in a total volume 
of 50 fil in 1 x buffer, 250 ptmol of each dNTP, 10 pmol 
each of sense and antisense primers, 1 fil of the Advantage 
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KlenTaq polymerase mix and 2 fxl of the cDNA reaction 
mixture (Tellier et al (1996)) . A single PCR round of 35 
cycles was performed in a Robocycler thermal cycler (from 
Stratagene) , and consisted of denaturation at 99 °C for 35 
sec, annealing at 67 °C for 30 sec and elongation at 68 °C 
for 10 min during the first 5 cycles, 11 min during the 
next 10 cycles, 12 min during the following 10 cycles and 
13 min during the last 10 cycles. To amplify the complete 
ORF of HCV by long RT-PCR we used the, sense primers HI or 
Al deduced from the 5' UTR and the antisense primer H9417R 
deduced from the variable region of the 3' UTR (Table 1, 
Fig. 1) . 
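
The stepped elongation schedule described above can be tabulated to confirm that its segments add up to the stated 35 cycles and to estimate the run time it implies. The following is a minimal sketch: the names are illustrative only, and ramp times between steps, which the text does not give, are ignored.

```python
# Long PCR program as stated in the text: (number of cycles, elongation minutes).
# Denaturation (35 s) and annealing (30 s) are the same in every cycle.
SEGMENTS = [(5, 10), (10, 11), (10, 12), (10, 13)]
DENATURATION_S = 35
ANNEALING_S = 30

def program_summary(segments, denaturation_s, annealing_s):
    """Return (total cycles, elongation minutes, total minutes without ramping)."""
    total_cycles = sum(n for n, _ in segments)
    elongation_min = sum(n * minutes for n, minutes in segments)
    fixed_min = total_cycles * (denaturation_s + annealing_s) / 60
    return total_cycles, elongation_min, elongation_min + fixed_min

cycles, elongation_min, total_min = program_summary(
    SEGMENTS, DENATURATION_S, ANNEALING_S
)
print(cycles, elongation_min, round(total_min / 60, 1))
```

Under these assumptions the elongation steps dominate the program, which runs for roughly seven and a half hours before any ramping time is counted.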

Construction of Full-Length H77 cDNA Clones

The long PCR products amplified with H1 and
H9417R primers were cloned directly into pGEM-9zf(-) after
digestion with Not I and Xba I (from Promega) (as per
Fig. 1). Two clones were obtained with inserts of the
expected size, pH21I and pH50I. Next, the chosen 3' UTR
was cloned into both pH21I and pH50I after digestion with
Afl II and Xba I (New England Biolabs). DH5α competent
cells (GIBCO/BRL) were transformed and selected with LB
agar plates containing 100 µg/ml ampicillin (from SIGMA).
Then the selected colonies were cultured in LB liquid
containing ampicillin at 30 °C for ~18-20 hrs
(transformants containing full-length or near full-length
cDNA of H77 produced a very low yield of plasmid when
cultured at 37 °C or for more than 24 hrs). After small
scale preparation (Wizard Plus Minipreps DNA Purification
Systems, Promega) each plasmid was retransformed to select
a single clone, and large scale preparation of plasmid DNA
was performed with a QIAGEN plasmid Maxi kit.

Cloning of Long RT-PCR Products Into a Cassette Vector

To improve the efficiency of cloning, a vector
with consensus 5' and 3' termini of HCV strain H77 was
constructed (Fig. 1). This cassette vector (pCV) was
obtained by cutting out the BamHI fragment (nts 1358 -
7530 of the H77 genome) from pH50, followed by religation.
Next, the long PCR products of H77 amplified with H1 and
H9417R or A1 and H9417R primers were purified (Geneclean
spin kit; BIO 101) and cloned into pCV after digestion
with Age I and Afl II (New England Biolabs) or with Pin AI
(isoschizomer of Age I) and Bfr I (isoschizomer of Afl II)
(Boehringer Mannheim). Large scale preparations of the
plasmids containing full-length cDNA of H77 were performed
as described above.

Construction of H77 Consensus Chimeric cDNA Clone 

A full-length cDNA clone of H77 with an ORF
encoding the consensus amino acid sequence was constructed
by making a chimera from four of the cDNA clones obtained
above. This consensus chimera, pCV-H77C, was constructed
in two ligation steps by using standard molecular
procedures and convenient cleavage sites and involved
first a two piece ligation and then a three piece
ligation. Large scale preparation of pCV-H77C was
performed as already described.

In Vitro Transcription

Plasmids containing the full-length HCV cDNA
were linearized with Xba I (from Promega), and purified by
phenol/chloroform extraction and ethanol precipitation. A
100 µl reaction mixture containing 10 µg of linearized
plasmid DNA, 1 x transcription buffer, 1 mM ATP, CTP, GTP
and UTP, 10 mM DTT, 4% (v/v) RNasin (20-40 units/µl) and 2
µl of T7 RNA polymerase (Promega) was incubated at 37 °C
for 2 hrs. Five µl of the reaction mixture was analyzed
by agarose gel electrophoresis followed by ethidium
bromide staining. The transcription reaction mixture was
diluted with 400 µl of ice-cold phosphate-buffered saline
without calcium or magnesium, immediately frozen on dry
ice and stored at -80 °C. The final nucleic acid mixture
was injected into chimpanzees within 24 hrs.

Intrahepatic Transfection of Chimpanzees 

Laparotomy was performed and aliquots from two
transcription reactions were injected into 6 sites of the
exposed liver (Emerson et al (1992)). Serum samples were
collected weekly from chimpanzees and monitored for liver
enzyme levels and anti-HCV antibodies. Weekly samples of
100 µl of serum were tested for HCV RNA in a highly
sensitive nested RT-PCR assay with AmpliTaq Gold (Perkin
Elmer) (Yanagi et al (1996); Bukh et al (1992)). The
genome titer of HCV was estimated by testing 10-fold
serial dilutions of the extracted RNA in the RT-PCR assay
(Yanagi et al (1996)). The two chimpanzees used in this
study were maintained under conditions that met all
requirements for their use in an approved facility.

The consensus sequence of the complete ORF from
HCV genomes recovered at week 2 post inoculation (p.i.)
was determined by direct sequencing of PCR products
obtained in long RT-PCR with primers A1 and H9417R
followed by nested PCR of 10 overlapping fragments. The
consensus sequence of the variable region of the 3' UTR
was determined by direct sequencing of an amplicon
obtained in nested RT-PCR as described above. Finally, we
amplified selected regions independently by nested RT-PCR
with AmpliTaq Gold.

Sequence Analysis

Both strands of DNA from PCR products, as well
as plasmids, were sequenced with the ABI PRISM Dye
Termination Cycle Sequencing Ready Reaction Kit using Taq
DNA polymerase (Perkin Elmer) and about 100 specific sense
and antisense sequence primers.

The consensus sequence of HCV strain H77 was
determined in two different ways. In one approach,
overlapping PCR products amplified in nested RT-PCR from
the H77 plasma sample were directly sequenced.
The sequence analyzed (nucleotides (nts) 35-9417) included
the entire genome except the very 5' and 3' termini. In
the second approach, the consensus sequence of nts 157-9384
was deduced from the sequences of 18 full-length cDNA
clones.

EXAMPLE 1



Variability in the sequence of the 3' UTR of HCV strain 
H77 

The heterogeneity of the 3' UTR was analyzed by
cloning and sequencing of DNA amplicons obtained in nested
RT-PCR. 19 clones containing sequences of the entire
variable region, the poly U-UC region and the adjacent 19
nt of the conserved region, and 65 clones containing
sequences of the entire poly U-UC region and the first 63
nts of the conserved region were analyzed. This analysis
confirmed that the variable region consisted of 43 nts,
including two conserved termination codons (Han et al
(1992)). The sequence of the variable region was highly
conserved within H77 since only 3 point mutations were
found among the 19 clones analyzed. A poly U-UC region
was present in all 84 clones analyzed. However, its
length varied from 71-141 nts. The length of the poly U
region was 9-103 nts, and that of the poly UC region was
35-85 nts. The number of C residues increased towards the
3' end of the poly UC region but the sequence of this
region is not conserved. The first 63 nts of the
conserved region were highly conserved among the clones
analyzed, with a total of only 14 point mutations. To
confirm the validity of the analysis, the 3' UTR was
reamplified directly from a full-length cDNA clone of HCV
(see below) by the nested-PCR procedure with the primers
in the variable region and at the very 3' end of the HCV
genome, and the PCR product was cloned. Eight clones had
1-7 nt deletions in the poly U region. Furthermore, although
the C residues of the poly UC region were maintained, the
spacing of these varied because of 1-2 nt deletions of U
residues. These deletions must be artifacts introduced by
PCR and such mistakes may have contributed to the
heterogeneity originally observed in this region.
However, the conserved region of the 3' UTR was amplified
correctly, suggesting that the deletions were due to
difficulties in transcribing a highly repetitive sequence.

One of the 3' UTR clones was selected for
engineering of full-length cDNA clones of H77. This clone
had the consensus variable sequence except for a single
point mutation introduced to create an Afl II cleavage
site, a poly U-UC stretch of 81 nts with the most commonly
observed UC pattern and the consensus sequence of the
complete conserved region of 101 nts, including the distal
38 nts which originated from the antisense primer used in
the amplification. After linearization with Xba I, the
DNA template of this clone had the authentic 3' end.

EXAMPLE 2

The Entire Open Reading Frame of H77
Amplified in One Round of Long RT-PCR

It had been previously demonstrated that a 9.25
kb fragment of the HCV genome from the 5' UTR to the 3'
end of NS5B could be amplified from 10^4 GE (genome
equivalents) of H77 by a single round of long RT-PCR
(Tellier et al (1996a)). In the current study, by
optimizing primers and cycling conditions, the entire ORF
of H77 was amplified in a single round of long RT-PCR with
primers from the 5' UTR and the variable region of the 3'
UTR. In fact, 9.4 kb of the H77 genome (H product: from
the very 5' end to the variable region of the 3' UTR) could
be amplified from 10^5 GE or 9.3 kb (A product: from within
the 5' UTR to the variable region of the 3' UTR) from 10^4
GE or 10^5 GE, in a single round of long RT-PCR (Fig. 2).
The PCR products amplified from 10^5 GE of H77 were used for
engineering full-length cDNA clones (see below).

EXAMPLE 3 

Construction of Multiple Full-Length
cDNA Clones of H77 in a Single Step by
Cloning of Long RT-PCR Amplicons Directly
into a Cassette Vector with Fixed 5' and 3' Termini



Direct cloning of the long PCR products (H),
which contained a 5' T7 promoter, the authentic 5' end, the
entire ORF of H77 and a short region of the 3' UTR, into
pGEM-9zf(-) vector by Not I and Xba I digestion was first
attempted. However, among the 70 clones examined all but
two had inserts that were shorter than predicted. Sequence
analysis identified a second Not I site in the majority of
clones, which resulted in deletion of the nts past
position 9221. Only two clones (pH21I and pH50I) were
missing the second Not I site and had the expected 5' and
3' sequences of the PCR product. Therefore, full-length
cDNA clones (pH21 and pH50) were constructed by inserting
the chosen 3' UTR into pH21I and pH50I, respectively.
Sequence analysis revealed that clone pH21 had a complete
full-length sequence of H77; this clone was tested for
infectivity. The second clone, pH50, had one nt deletion
in the ORF at position 6365; this clone was used to make a
cassette vector.

The complete ORF was amplified by constructing a
cassette vector with fixed 5' and 3' termini as an
intermediate of the full-length cDNA clones. This vector
(pCV) was constructed by digestion of clone pH50 with
BamHI, followed by religation, to give a shortened plasmid
readily distinguished from plasmids containing the
full-length insert. Attempts to clone long RT-PCR products
(H) into pCV by Age I and Afl II yielded only 1 of 23 clones
with an insert of the expected size. In order to increase
the efficiency of cloning, we repeated the procedure but
used Pin AI and Bfr I instead of the respective
isoschizomers Age I and Afl II. By this protocol, 24 of
31 H clones and 30 of 35 A clones had the full-length cDNA
of H77 as evaluated by restriction enzyme digestion. A
total of 16 clones, selected at random, were each
retransformed, and individual plasmids were purified and
completely sequenced.

EXAMPLE 4 

Demonstration of Infectious Nature
of Transcripts of a cDNA Clone
Representing the Consensus Sequence of Strain H77

A consensus chimera was constructed from 4 of
the full-length cDNA clones with just 2 ligation steps.
The final construct, pCV-H77C, had 11 nt differences from
the consensus sequence of H77 in the ORF (Fig. 3).
However, 10 of these nucleotide differences represented
silent mutations. The chimeric clone differed from the
consensus sequence at only one amino acid [L instead of F
at position 790]. Among the 18 ORFs analyzed above, the F
residue was found in 11 clones and the L residue in 7
clones. However, the L residue was dominant in other
isolates of genotype 1a, including a first passage of H77
in a chimpanzee (Inchauspe et al (1991)).
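
Deducing a consensus from a set of clones, as done for the 18 ORFs mentioned above, amounts to taking the most frequent residue at each aligned position. The sketch below illustrates that majority rule on toy data. It assumes the sequences are already aligned to equal length; the function name and the three-residue "clones" are hypothetical and are not taken from the H77 sequence.

```python
from collections import Counter

def consensus(aligned):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Returns the consensus string and, for every position that is not
    unanimous, the residue counts observed there.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    residues, variable = [], {}
    for pos, column in enumerate(zip(*aligned), start=1):
        counts = Counter(column)
        residues.append(counts.most_common(1)[0][0])
        if len(counts) > 1:
            variable[pos] = dict(counts)
    return "".join(residues), variable

# Toy data (not real H77 sequence): 18 three-residue "clones" in which the
# middle position is F in 11 clones and L in 7, mirroring the split
# reported for position 790.
clones = ["AFG"] * 11 + ["ALG"] * 7
seq, variable = consensus(clones)
print(seq, variable)
```

Reporting the per-position counts alongside the consensus keeps minority residues such as the L at position 790 visible rather than discarding them.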

To test the infectivity of the consensus
chimeric clone of H77, intrahepatic transfection of a
chimpanzee was performed. The pCV-H77C clone was
linearized with Xba I and transcribed in vitro by T7 RNA
polymerase (Fig. 2). The transcription mixture was next
injected into 6 sites of the liver of chimpanzee 1530.
The chimpanzee became infected with HCV as measured by
detection of 10^2 GE/ml of viral genome at week 1 p.i.
Furthermore, the HCV titer increased to 10^4 GE/ml at week
2 p.i., and reached 10^6 GE/ml by week 8 p.i. The viremic
pattern observed in the early phase of the infection with
the recombinant virus was similar to that observed in
chimpanzees inoculated intravenously with strain H77 or
other strains of HCV (Shimizu (1990)).

The sequence of the HCV genomes from the serum
sample collected at week 2 p.i. was analyzed. The
consensus sequence of nts 298-9375 of the recovered
genomes was determined by direct sequencing of PCR
products obtained in long RT-PCR followed by nested PCR of
10 overlapping fragments. The identity to clone pCV-H77C
sequence was 100%. The consensus sequence of nts 96-291,
1328-1848, 3585-4106, 4763-5113 and 9322-9445 was
determined from PCR products obtained in different nested
RT-PCR assays. The identity of these sequences with
pCV-H77C was also 100%. These latter regions contained 4
mutations unique to the consensus chimera, including the
artificial Afl II cleavage site in the 3' UTR. Therefore,
RNA transcripts of this clone of HCV were infectious.

The infectious nature of the consensus chimera
indicates that the regions of the 5' and 3' UTRs
incorporated into the cassette vector do not destroy
viability. This makes it highly advantageous to use the
cassette vector to construct infectious cDNA clones of
other HCV strains when the consensus sequence for each ORF
is inserted.

In addition, two complete full-length clones
(dubbed pH21 and pCV-H11) constructed were not infectious,
as shown by intrahepatic injection of chimpanzees with the
corresponding RNA transcripts. Thus, injection of the
transcription mixture into 3 sites of the exposed liver
resulted in no observable HCV replication and weekly serum
samples were negative for HCV RNA at weeks 1-17 p.i. in
a highly sensitive nested RT-PCR assay. The cDNA template
injected along with the RNA transcripts was also not
detected in this assay.

Moreover, the chimpanzee remained negative for
antibodies to HCV throughout the follow-up. Subsequent
sequence analysis revealed that 7 of 16 additional clones
were defective for polyprotein synthesis and all clones
had multiple amino acid mutations compared with the
consensus sequence of the parent strain. For example,
clone pH21, which was not infectious, had 7 amino acid
substitutions in the entire predicted polyprotein compared
with the consensus sequence of H77 (Fig. 3). The most
notable mutation was at position 1026, which changed L to
Q, altering the cleavage site between NS2 and NS3 (Reed
(1995)). Clone pCV-H11, also non-infectious, had 21 amino
acid substitutions in the predicted polyprotein compared
with the consensus sequence of H77 (Fig. 3). The amino
acid mutation at position 564 eliminated a highly
conserved C residue in the E2 protein (Okamoto (1992a)).

EXAMPLE 4A 

The chimpanzee of Example 4, designated 1530,
was monitored out to 32 weeks p.i. for serum enzyme levels
(ALT) and the presence of anti-HCV antibodies, HCV RNA,
and liver histopathology. The results are shown in Figure
18B.

A second chimp, designated 1494, was also 
transfected with RNA transcripts of the pCV-H77C clone and 
monitored out to 17 weeks p.i. for the presence of anti- 
HCV antibodies, HCV RNA and elevated serum enzyme levels. 
The results are shown in Figure 18A. 

MATERIALS AND METHODS 
for Examples 5-10 






Source Of HCV Genotype 1b

An infectious plasma pool (second chimpanzee
passage) containing strain HC-J4, genotype 1b, was
prepared from acute phase plasma of a chimpanzee
experimentally infected with serum containing HC-J4/91
(Okamoto et al., 1992b). The HC-J4/91 sample was obtained
from a first chimpanzee passage during the chronic phase
of hepatitis C about 8 years after experimental infection.
The consensus sequence of the entire genome, except for
the very 3' end, was determined previously for HC-J4/91
(Okamoto et al., 1992b).

Preparation Of HCV RNA

Viral RNA was extracted from 100 µl aliquots of
the HC-J4 plasma pool with the TRIzol system (GIBCO BRL).
The RNA pellets were each resuspended in 10 µl of 10 mM
dithiothreitol (DTT) with 5% (vol/vol) RNasin (20-40
units/µl) (Promega) and stored at -80 °C or immediately
used for cDNA synthesis.

Amplification And Cloning Of The 3' UTR



A region spanning from NS5B to the conserved
region of the 3' UTR was amplified in nested RT-PCR using
the procedure of Yanagi et al. (1997).

In brief, the RNA was denatured at 65 °C for 2
minutes, and cDNA was synthesized at 42 °C for 1 hour with
Superscript II reverse transcriptase (GIBCO BRL) and
primer H3'X58R (Table 1) in a 20 µl reaction volume. The
cDNA mixture was treated with RNase H and RNase T1 (GIBCO
BRL) at 37 °C for 20 minutes. The first round of PCR was
performed on 2 µl of the final cDNA mixture in a total
volume of 50 µl with the Advantage cDNA polymerase mix
(Clontech) and external primers H9261F (Table 1) and
H3'X58R (Table 1). In the second round of PCR [internal
primers H9282F (Table 1) and H3'X45R (Table 1)], 5 µl of
the first round PCR mixture was added to 45 µl of the PCR
reaction mixture. Each round of PCR (35 cycles) was
performed in a DNA thermal cycler 480 (Perkin Elmer) and
consisted of denaturation at 94 °C for 1 minute (1st cycle:
1 minute 30 sec), annealing at 60 °C for 1 minute and
elongation at 68 °C for 2 minutes. After purification with
QIAquick PCR purification kit (QIAGEN), digestion with
Hind III and Xba I (Promega), and phenol/chloroform
extraction, the amplified products were cloned into
pGEM-9zf(-) (Promega) (Yanagi et al., 1997).

Amplification And Cloning Of The Entire ORF 



A region from within the 5' UTR to the variable
region of the 3' UTR of strain HC-J4 was amplified by long
RT-PCR (Fig. 1) (Yanagi et al., 1997). The cDNA was
synthesized at 42 °C for 1 hour in a 20 µl reaction volume
with Superscript II reverse transcriptase and primer
J4-9405R (5'-GCCTATTGGCCTGGAGTGGTTAGCTC-3'), and treated with
RNases as above. The cDNA mixture (2 µl) was amplified by
long PCR with the Advantage cDNA polymerase mix and
primers A1 (Table 1) (Bukh et al., 1992; Yanagi et al.,
1997) and J4-9398R
(5'-AGTGATGGCCTTAAGGCCTGGAGTGGTTAGCTCCCCGTTCA-3'). Primer
J4-9398R contained extra bases (bold) and an artificial Afl II
cleavage site (underlined). A single PCR round was
performed in a Robocycler thermal cycler (Stratagene), and
consisted of denaturation at 99 °C for 35 seconds,
annealing at 67 °C for 30 seconds and elongation at 68 °C
for 10 minutes during the first 5 cycles, 11 minutes
during the next 10 cycles, 12 minutes during the following
10 cycles and 13 minutes during the last 10 cycles.

After digesting the long PCR products obtained
from strain HC-J4 with Pin AI (isoschizomer of Age I) and
Bfr I (isoschizomer of Afl II) (Boehringer Mannheim),
attempts were made to clone them directly into a cassette
vector (pCV), which contained the 5' and 3' termini of
strain H77 (Figure 1), but no full-length clones were
obtained. Accordingly, to improve the efficiency of
cloning, the PCR product was further digested with Bgl II
(Boehringer Mannheim) and the two resultant genome
fragments [L fragment: Pin AI/Bgl II, nts 156 - 8935; S
fragment: Bgl II/Bfr I, nts 8936 - 9398] were separately
cloned into pCV (Figure 6).
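
Because the cloning strategy depends on where the restriction sites fall, it can be useful to scan a sequence for them before choosing enzymes. The sketch below is illustrative only: it uses the standard recognition sequences of the enzymes named in the text, the function name is hypothetical, and it is applied to the two primer sequences quoted above to confirm that only J4-9398R carries the artificial Afl II site.

```python
# Recognition sequences (5'->3'); all five are palindromic, so scanning one
# strand is sufficient. Pin AI and Bfr I recognize the same sequences as
# Age I and Afl II, respectively.
SITES = {
    "Age I": "ACCGGT",
    "Afl II": "CTTAAG",
    "Bgl II": "AGATCT",
    "Not I": "GCGGCCGC",
    "Xba I": "TCTAGA",
}

def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits

# Primer sequences as given in the text; the third base of J4-9398R is
# taken to be T (an assumption).
J4_9398R = "AGTGATGGCCTTAAGGCCTGGAGTGGTTAGCTCCCCGTTCA"
J4_9405R = "GCCTATTGGCCTGGAGTGGTTAGCTC"
print(find_sites(J4_9398R))
print(find_sites(J4_9405R))
```

The same scan applied to a full-length insert would flag an unexpected internal site, such as the second Not I site that truncated most of the directly cloned H77 products.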

DH5α competent cells (GIBCO BRL) were
transformed and selected on LB agar plates containing 100
µg/ml ampicillin (SIGMA) and amplified in LB liquid
cultures at 30 °C for 18-20 hours.

Sequence analysis of 9 plasmids containing the S
fragment (miniprep samples) and 9 plasmids containing the
L fragment (maxiprep samples) was performed as described
previously (Yanagi et al., 1997). Three L fragments, each
encoding a distinct polypeptide, were cloned into pCV-J4S9
(which contained an S fragment encoding the consensus
amino acid sequence of HC-J4) to construct three chimeric
full-length HCV cDNAs (pCV-J4L2S, pCV-J4L4S and pCV-J4L6S)
(Fig. 6). Large scale preparation of each clone was
performed as described previously with a QIAGEN plasmid
Maxi kit (Yanagi et al., 1997) and the authenticity of
each clone was confirmed by sequence analysis.




Sequence Analysis 

Both strands of DNA were sequenced with the ABI
PRISM Dye Termination Cycle Sequencing Ready Reaction Kit
using Taq DNA polymerase (Perkin Elmer) and about 90
specific sense and antisense primers. Analyses of genomic
sequences, including multiple sequence alignments and tree
analyses, were performed with GeneWorks (Oxford Molecular
Group) (Bukh et al., 1995).

The consensus sequence of strain HC-J4 was
determined by direct sequencing of PCR products (nts 11 -
9412) and by sequence analysis of multiple cloned L and S
fragments (nts 156 - 9371). The consensus sequence of the
3' UTR (3' variable region, polypyrimidine tract and the
first 16 nucleotides of the conserved region) was
determined by analysis of 24 cDNA clones.

Intrahepatic Transfection Of A Chimpanzee
With Transcribed RNA

Two in vitro transcription reactions were
performed with each of the three full-length clones. In
each reaction 10 µg of plasmid DNA linearized with Xba I
(Promega) was transcribed in a 100 µl reaction volume with
T7 RNA polymerase (Promega) at 37 °C for 2 hours as
described previously (Yanagi et al., 1997). Five µl of
the final reaction mixture was analyzed by agarose gel
electrophoresis and ethidium bromide staining (Fig. 5).
Each transcription mixture was diluted with 400 µl of
ice-cold phosphate-buffered saline without calcium or
magnesium and then the two aliquots from the same cDNA
clone were combined, immediately frozen on dry ice and
stored at -80 °C. Within 24 hours after freezing the
transcription mixtures were injected into the chimpanzee
by percutaneous intrahepatic injection that was guided by
ultrasound. Each inoculum was individually injected (5-6
sites) into a separate area of the liver to prevent
complementation or recombination. The chimpanzee was
maintained under conditions that met all requirements for
its use in an approved facility.

Serum samples were collected weekly from the
chimpanzee and monitored for liver enzyme levels and
anti-HCV antibodies. Weekly samples of 100 µl of serum
were tested for HCV RNA in a sensitive nested RT-PCR assay
(Bukh et al., 1992; Yanagi et al., 1996) with AmpliTaq
Gold DNA polymerase. The genome equivalent (GE) titer of
HCV was determined by testing 10-fold serial dilutions of
the extracted RNA in the RT-PCR assay (Yanagi et al.,
1996) with 1 GE defined as the number of HCV genomes
present in the highest dilution which was positive in the
RT-nested PCR assay.
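
The end-point definition of the genome equivalent given above translates directly into a small calculation. The following sketch is a hedged illustration rather than the assay's actual data reduction: the function name, the dilution series and the serum volume represented per reaction (0.01 ml) are all hypothetical values chosen for the example, since the text does not state what fraction of the extracted RNA entered each reaction.

```python
def endpoint_titer(results, serum_ml_per_reaction):
    """Estimate GE/ml from a 10-fold dilution series.

    results maps the dilution exponent k (RNA diluted 10**-k) to True/False
    for a positive/negative nested RT-PCR. serum_ml_per_reaction is the
    volume of serum represented by the undiluted RNA in one reaction.
    """
    positive = [k for k, is_positive in results.items() if is_positive]
    if not positive:
        return 0.0
    # By definition, the highest positive dilution contains 1 GE.
    ge_in_undiluted = 10 ** max(positive)
    return ge_in_undiluted / serum_ml_per_reaction

# Hypothetical series: positive through the 10**-2 dilution, negative beyond.
series = {0: True, 1: True, 2: True, 3: False, 4: False}
titer = endpoint_titer(series, serum_ml_per_reaction=0.01)
print(f"{titer:.0e} GE/ml")
```

An end-point estimate of this kind resolves the titer only to the nearest 10-fold step, which is consistent with titers being reported as powers of ten.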

To identify which of the three clones was
infectious in vivo, the NS3 region (nts 3659 - 4110) from
the chimpanzee serum was amplified in a highly sensitive
and specific nested RT-PCR assay with AmpliTaq Gold DNA
polymerase and the PCR products were cloned with a TA
cloning kit (Invitrogen). In addition, the consensus
sequence of the nearly complete genome (nts 11 - 9441) was
determined by direct sequencing of overlapping PCR
products.






EXAMPLE 5 



Sequence Analysis Of Infectious Plasma Pool 
Of Strain HC-J4 Used As The Cloning Source 

As an infectious cDNA clone of a genotype 1a
strain of HCV had been obtained only after the ORF was
engineered to encode the consensus polypeptide (Kolykhalov
et al., 1997; Yanagi et al., 1997), a detailed sequence
analysis of the cloning source was performed to determine
the consensus sequence prior to constructing an infectious
cDNA clone of a 1b genotype.

A plasma pool of strain HC-J4 was prepared from
acute phase plasmapheresis units collected from a
chimpanzee experimentally infected with HC-J4/91 (Okamoto
et al., 1992b). This HCV pool had a PCR titer of 10^4 -
10^5 GE/ml and an infectivity titer of approximately 10^3
chimpanzee infectious doses per ml.

The heterogeneity of the 3' UTR of strain HC-J4
was determined by analyzing 24 clones of nested RT-PCR
product. The consensus sequence was identical to that
previously published for HC-J4/91 (Okamoto et al., 1992b),
except at position 9407 (see below). The variable region
consisted of 41 nucleotides (nts 9372 - 9412), including
two in-frame termination codons. Furthermore, its
sequence was highly conserved except at positions 9399 (19
A and 5 T clones) and 9407 (17 T and 7 A clones). The
poly U-UC region varied slightly in composition and
greatly in length (31-162 nucleotides). In the conserved
region, the first 16 nucleotides of 22 clones were
identical to those previously published for other genotype
1 strains, whereas two clones each had a single point
mutation. These data suggested that the structural
organization at the 3' end of HC-J4 was similar to that of
the infectious clone of a genotype 1a strain of Yanagi et
al. (1997).

Next, the entire ORF of HC-J4 was amplified in a
single round of long RT-PCR (Figure 5). The original plan
was to clone the resulting PCR products into the Pin AI and
Bfr I sites of an HCV cassette vector (pCV), which had fixed
5' and 3' termini of genotype 1a (Yanagi et al., 1997), but
since full-length clones were not obtained, two genome
fragments (L and S) derived from the long RT-PCR products
(Figure 6) were separately subcloned into pCV.

To determine the consensus sequence of the ORF,
the sequence of 9 clones each of the L fragment (pCV-J4L)
and of the S fragment (pCV-J4S) was determined, and
quasispecies were found at 275 nucleotide (3.05%) and 78
amino acid (2.59%) positions, scattered throughout the
9030 nts (3010 aa) of the ORF (Figure 7). Of the 161
nucleotide substitutions unique to a single clone, 71%
were at the third position of the codon and 72% were
silent.
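
The percentages quoted above follow from the position counts and the ORF length. As a brief arithmetic check (a sketch only; the helper name is illustrative):

```python
def percent(count, total):
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100 * count / total, 2)

ORF_NT, ORF_AA = 9030, 3010   # ORF length in nucleotides and amino acids
assert ORF_NT == 3 * ORF_AA   # three nucleotides per codon

print(percent(275, ORF_NT))   # variable nucleotide positions
print(percent(78, ORF_AA))    # variable amino acid positions
```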

Each of the nine L clones represented the near
complete ORF of an individual genome. The differences
among the L clones were 0.30 - 1.53% at the nucleotide and
0.31 - 1.47% at the amino acid level (Figure 8). Two
clones, L1 and L7, had a defective ORF due to a single
nucleotide deletion and a single nucleotide insertion,
respectively. Even though the HC-J4 plasma pool was
obtained in the early acute phase, it appeared to contain
at least three viral species (Figure 9). Species A
contained the L1, L2, L6, L8 and L9 clones, species B the
L3, L7 and L10 clones and species C the L4 clone.
Although each species A clone was unique, all A clones
differed from all B clones at the same 20 amino acid sites
and at these positions, species C had the species A
sequence at 14 positions and the species B sequence at 6
positions (Figure 7).
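
Ranges of divergence such as those quoted for the L clones are obtained by comparing every pair of aligned sequences. A minimal sketch of that comparison follows. It assumes pre-aligned sequences of equal length with no gaps; the function names are illustrative, and the 20-residue toy sequences are hypothetical, not HC-J4 data.

```python
from itertools import combinations

def percent_difference(a, b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100 * mismatches / len(a)

def pairwise_range(clones):
    """Return (min, max) percent difference over all pairs of named clones."""
    values = [percent_difference(clones[p], clones[q])
              for p, q in combinations(sorted(clones), 2)]
    return min(values), max(values)

# Toy 20-residue sequences (hypothetical, not HC-J4 data).
toy = {
    "L2": "MSTNPKPQRKTKRNTNRRPQ",
    "L3": "MSTNPKPQRKTKRNTNRRPE",  # 1 difference from L2
    "L4": "MSTNPKAQRKTKRNTNRRPE",  # 2 differences from L2, 1 from L3
}
low, high = pairwise_range(toy)
print(low, high)
```

Grouping clones into species, as in Figure 9, would additionally require a clustering or tree-building step, which this sketch does not attempt.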

Okamoto and coworkers (Okamoto et al., 1992b)
previously determined the nearly complete genome consensus
sequence of strain HC-J4 in acute phase serum of the first
chimpanzee passage (HC-J4/83) as well as in chronic phase
serum collected 8.2 years later (HC-J4/91). In addition,
they determined the sequence of amino acids 379 to 413
(including HVR1) and amino acids 468 to 486 (including
HVR2) of multiple individual clones (Okamoto et al.,
1992b).

It was found by the present inventors that the
sequences of individual genomes in the plasma pool
collected from a chimpanzee inoculated with HC-J4/91 were
all more closely related to HC-J4/91 than to HC-J4/83
(Figures 8, 9) and contained HVR amino acid sequences
closely related to three of the four viral species
previously found in HC-J4/91 (Figure 10).

Thus, the data presented herein demonstrate the
occurrence of the simultaneous transmission of multiple
species to a single chimpanzee and clearly illustrate the
difficulties in accurately determining the evolution of
HCV over time since multiple species with significant
changes throughout the HCV genome can be present from the
onset of the infection. Accordingly, infection of
chimpanzees with monoclonal viruses derived from the
infectious clones described herein will make it possible
to perform more detailed studies of the evolution of HCV
in vivo and its importance for viral persistence and
pathogenesis.

EXAMPLE 6 

Determination Of The Consensus
Sequence Of HC-J4 In The Plasma Pool

The consensus sequence of nucleotides 156-9371
of HC-J4 was determined by two approaches. In one
approach, the consensus sequence was deduced from 9 clones
of the long RT-PCR product. In the other approach the
long RT-PCR product was reamplified by PCR as overlapping
fragments which were sequenced directly. The two
"consensus" sequences differed at 31 (0.34%) of 9216
nucleotide positions and at 11 (0.37%) of 3010 deduced
amino acid positions (Figure 7). At all of these
positions a major quasispecies of strain HC-J4 was found
in the plasma pool. At 9 additional amino acid positions
the cloned sequences displayed heterogeneity and the
direct sequence was ambiguous (Figure 7). Finally, it
should be noted that there were multiple amino acid
positions at which the consensus sequence obtained by
direct sequencing was identical to that obtained by
cloning and sequencing even though a major quasispecies
was detected (Figure 7).

For positions at which the two "consensus" 
sequences of HC-J4 differed, both amino acids were 
included in a composite consensus sequence (Figure 7). 
However, even with this allowance, none of the 9 L clones 
analyzed (aa 1-2864) had the composite consensus 
sequence: two clones did not encode the complete 
polypeptide and the remaining 7 clones differed from the 
consensus sequence by 3-13 amino acids (Figure 7). 
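
The idea of a composite consensus can be pictured as a set of permitted residues at each position, against which each clone is scored. The Python sketch below is a hypothetical illustration only: the function names are invented, the six-residue sequences are toy data and not HC-J4 sequence, and the majority rule used for the cloned consensus is an assumption about how a consensus is deduced from clones. 

```python
from collections import Counter

def majority_consensus(clones):
    """Most common residue at each position of equal-length sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*clones))

def composite(consensus_a, consensus_b):
    """Per-position set of residues permitted by either consensus."""
    return [{a, b} for a, b in zip(consensus_a, consensus_b)]

def differences(clone, allowed):
    """Number of positions at which the clone matches neither consensus."""
    return sum(res not in ok for res, ok in zip(clone, allowed))

# Toy data, invented for illustration (not HC-J4 sequence).
clones = ["MSTNPK", "MSTNPR", "MATNPK"]
cloned_consensus = majority_consensus(clones)   # consensus deduced from clones
direct_consensus = "MSTNPR"                     # stand-in for direct sequencing
allowed = composite(cloned_consensus, direct_consensus)

for clone in clones:
    print(clone, differences(clone, allowed))
```

In this toy case the two consensus sequences disagree at the last position, so both residues are permitted there, and only the third clone departs from the composite (at its second position). 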

EXAMPLE 7 

Construction Of Chimeric Full-Length cDNA 
Clones Containing The Entire ORF Of HC-J4 

The cassette vector used to clone strain H77 was 
used to construct an infectious cDNA clone containing the 
ORF of a second subtype. 

In brief, three full-length cDNA clones were 
constructed by cloning different L fragments into the 
PinAI/BglII site of pCV-J4S9, the cassette vector for 
genotype 1a (Figure 6), which also contained an S fragment 
encoding the consensus amino acid sequence of HC-J4. 

Therefore, although the ORF was from strain HC-J4, most of 
the 5' and 3' terminal sequences originated from strain 
H77. As a result, the 5' and 3' UTR were chimeras of 
genotypes 1a and 1b (Figure 11). 

The first 155 nucleotides of the 5' UTR were 
from strain H77 (genotype 1a), and differed from the 
authentic sequence of HC-J4 (genotype 1b) at nucleotides 
11, 12, 13, 34 and 35. In two clones (pCV-J4L2S, 
pCV-J4L6S) the rest of the 5' UTR had the consensus sequence 
of HC-J4, whereas the third clone (pCV-J4L4S) had a single 
nucleotide insertion at position 207. In all 3 clones the 
first 27 nucleotides of the 3' variable region of the 3' 
UTR were identical with the consensus sequence of HC-J4. 
The remaining 15 nucleotides of the variable region, the 
poly U-UC region and the 3' conserved region of the 3' UTR 
had the same sequence as an infectious clone of strain H77 
(Figure 11). 

None of the three full-length clones of HC-J4 
had the ORF composite consensus sequence (Figures 7, 12). 
The pCV-J4L6S clone had only three amino acid changes: Q 
for R at position 231 (E1), V for A at position 937 (NS2), 
and T for S at position 1215 (NS3). The pCV-J4L4S clone 
had 7 amino acid changes, including a change at position 
450 (E2) that eliminated a highly conserved N-linked 
glycosylation site (Okamoto et al., 1992a). Finally, the 
pCV-J4L2S clone had 9 amino acid changes compared with the 
consensus sequence of HC-J4. A change at position 304 
(E1) mutated a highly conserved cysteine residue (Bukh et 
al., 1993; Okamoto et al., 1992a). 

EXAMPLE 8 

Transfection Of A Chimpanzee By In 
Vitro Transcripts Of A Chimeric cDNA 

The infectivity of the three chimeric HCV clones 
was determined by ultrasound-guided percutaneous 
intrahepatic injection into the liver of a chimpanzee of 
the same amount of cDNA and transcription mixture for each 
of the clones (Figure 5). This procedure is less 
invasive than the laparotomy procedure utilized 
by Kolykhalov et al. (1997) and Yanagi et al. (1997) and 
should facilitate in vivo studies of cDNA clones of HCV in 
chimpanzees since percutaneous procedures, unlike 
laparotomy, can be performed repeatedly. 

As shown in Figure 13, the chimpanzee became 
infected with HCV as measured by increasing titers of 10^2 
GE/ml at week 1 p.i., 10^3 GE/ml at week 2 p.i. and 10^4 - 
10^5 GE/ml during weeks 3 to 10 p.i. 

The viremic pattern found in the early phase of 
the infection was similar to that observed for the 
recombinant H77 virus in chimpanzees (Bukh et al., 
unpublished data; Kolykhalov et al., 1997; Yanagi et al., 
1997). The chimpanzee transfected in the present study 
was chronically infected with hepatitis G virus (HGV/GBV-C) 
(Bukh et al., 1998) and had a titer of 10^6 GE/ml at the 
time of HCV transfection. Although HGV/GBV-C was 
originally believed to be a hepatitis virus, it does not 
cause hepatitis in chimpanzees (Bukh et al., 1998) and may 
not replicate in the liver (Laskus et al., 1997). The 
present study demonstrated that an ongoing infection of 
HGV/GBV-C did not prevent acute HCV infection in the 
chimpanzee model. 

To identify which of the three full-length 
HC-J4 clones were infectious, the NS3 region (nts. 
3659-4110) of HCV genomes amplified by RT-PCR from serum 
samples taken from the infected chimpanzee during weeks 2 
and 4 post-infection (p.i.) was cloned and sequenced. As 
the PCR primers were a complete match with each of the 
original three clones, this assay should not have 
preferentially amplified one virus over another. Sequence 
analysis of 26 and 24 clones obtained at weeks 2 and 4 
p.i., respectively, demonstrated that all originated from 
the transcripts of pCV-J4L6S. 
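
The relation between the amplified nucleotide window and the amino acid positions discussed in Example 7 can be worked through with a small calculation. The Python sketch below is illustrative only. It assumes that the ORF begins at nucleotide 342, an inference from the statements in this document that the 5' UTR spans nucleotides up to 341 and that the 3010-codon consensus ends at nucleotide 9371; the function names are hypothetical. 

```python
ORF_START = 342  # assumption: first nucleotide of the initiator codon

def aa_position(nt):
    """Amino acid position whose codon contains nucleotide `nt`."""
    return (nt - ORF_START) // 3 + 1

def codon_range(aa):
    """First and last nucleotide of the codon for amino acid `aa`."""
    first = ORF_START + 3 * (aa - 1)
    return first, first + 2

first_aa = aa_position(3659)   # first residue touched by the NS3 amplicon
last_aa = aa_position(4110)    # last residue touched by the NS3 amplicon

print(first_aa, last_aa)       # 1106 1257
print(codon_range(1215))       # (3984, 3986)
print(codon_range(3010))       # (9369, 9371)
```

Under this assumption, position 1215 (the T-for-S change reported for pCV-J4L6S in NS3) lies inside the amplified window, which is consistent with using this region to tell the clones apart. 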

Moreover, the consensus sequence of PCR products 
of the nearly complete genome (nts. 11-9441), amplified 
from serum obtained during week 2 p.i., was identical to 
the sequence of pCV-J4L6S and there was no evidence of 
quasispecies. Thus, RNA transcripts of pCV-J4L6S, but not 
of pCV-J4L2S or pCV-J4L4S, were infectious in vivo. The 
data in Figure 13 are therefore the product of the 
transfection of RNA transcripts of pCV-J4L6S. 

In addition, the chimeric sequences of genotypes 
1a and 1b in the UTRs were maintained in the infected 
chimpanzee. The consensus sequence of nucleotides 11-341 
of the 5' UTR and the variable region of the 3' UTR, 
amplified from serum obtained during weeks 2 and 4 p.i., 
had the expected chimeric sequence of genotypes 1a and 1b 
(Figure 11). Also, three of four clones of the 3' UTR 
obtained at week 2 p.i. had the chimeric sequence of the 
variable region, whereas a single substitution was noted 
in the fourth clone. However, in all four clones the poly 
U region was longer (2-12 nts) than expected. Also, extra 
C and G residues were observed in this region. For the 
most part, the number of C residues in the poly UC region 
was maintained in all clones although the spacing varied. 
As shown previously, variations in the number of U 
residues can reflect artifacts introduced during PCR 
amplification (Yanagi et al., 1997). The sequence of the 
first 19 nucleotides of the conserved region was 
maintained in all four clones. Thus, with the exception 
of the poly U-UC region, the genomic sequences recovered 
from the infected chimpanzee were exactly those of the 
chimeric infectious clone pCV-J4L6S. 

The results presented in Figure 13 therefore 
demonstrate that HCV polypeptide sequences other than the 
consensus sequence can be infectious and that a chimeric 
genome containing portions of the H77 termini could 
produce an infectious virus. In addition, these results 
showed for the first time that it is possible to make 
infectious viruses containing 5' and 3' terminal sequences 
specific for two different subtypes of the same major 
genotype of HCV. 

EXAMPLE 9 

Construction Of A Chimeric 
1a/1b Infectious Clone 



A chimeric 1a/1b infectious clone in which the 
structural region of the genotype 1b infectious clone is 
inserted into the 1a clone of Yanagi et al. (1997) is 
constructed by following the protocol shown in Figure 15. 
The resultant chimera contains nucleotides 156-2763 of the 
1b clone described herein inserted into the 1a clone of 
Figures 4A-4F. The sequences of the primers shown in 
Figure 15 which are used in constructing this chimeric 
clone, designated pH77CV-J4, are presented below. 

1. H2751S (Cla I/Nde I) 
CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA 

2. H2870R 
CAT GCA CCA GCT GAT ATA GCG CTT GTA ATA TG 

3. H7851S 
TCC GTA GAG GAA GCT TGC AGC CTG ACG CCC 

4. H9173R (P-M) 
GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G 

5. H9140S (P-M) 
CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C 

6. H9417R 
CGT CTC TAG ACA GGA AAT GGC TTA AGA GGC CGG AGT GTT 
TAC C 

7. J4-2271S 
TGC AAT TGG ACT CGA GGA GAG CGC TGT AAC TTG GAG 

8. J4-2776R (Nde I) 
CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG 
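
Two properties of the primer list can be checked mechanically: the restriction sites named in the primer labels, and the relationship between the paired (P-M) primers. The Python sketch below is illustrative only. It assumes the standard recognition sequences ATCGAT for Cla I and CATATG for Nde I, reads the final residue of H9140S as C, and uses a hypothetical helper name. 

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))

# Primers copied from the list above; spaces are removed before use.
primers = {
    "H2751S": "CGT CAT CGA TCC TCA GCG GGC ATA TGC ACT GGA CAC GGA",
    "H9173R": "GTA CTT GCC ACA TAT AGC AGC CCT GCC TCC TCT G",
    "H9140S": "CAG AGG AGG CAG GGC TGC TAT ATG TGG CAA GTA C",
    "J4-2776R": "CGG TCC AAG GCA TAT GCT CGT GGT GGT AAC GCC AG",
}
primers = {name: seq.replace(" ", "") for name, seq in primers.items()}

CLA_I = "ATCGAT"   # assumed Cla I recognition sequence
NDE_I = "CATATG"   # assumed Nde I recognition sequence

print(CLA_I in primers["H2751S"], NDE_I in primers["H2751S"])
print(NDE_I in primers["J4-2776R"])
print(reverse_complement(primers["H9173R"]) == primers["H9140S"])
```

All four checks print True: the labelled sites are present in H2751S and J4-2776R, and H9173R is the exact reverse complement of H9140S, as would be expected for a complementary primer pair. 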

Transcripts of the chimeric 1a/1b clone (whose 
sequence is shown in Figures 16A-16F) are then produced 
and transfected into chimpanzees by the methods described 
in the Materials and Methods section herein and the 
transfected animals are then subjected to biochemical 
(ALT levels), histopathological and PCR analyses to 
determine the infectivity of the chimeric clone. 

EXAMPLE 10 

Construction Of 3' Deletion Mutants 
Of The 1a Infectious Clone pCV-H77C 

Seven constructs having various deletions in the 
3' untranslated region (UTR) of the 1a infectious clone 
pCV-H77C were constructed as described in Figures 17A-17B. 
The 3' untranslated sequence remaining in each of the 
seven constructs following their respective deletions is 
shown in Figures 17A-17B. 

Construct pCV-H77C(-98X), containing a deletion 
of the 3'-most 98 nucleotides of the 3'-UTR, was 
transcribed in vitro according to the methods described 
herein and 1 ml of the diluted transcription mixture was 
percutaneously transfected into the liver of a chimpanzee 
with the aid of ultrasound. After three weeks, the 
transfection was repeated. The chimpanzee was observed to 
be negative for hepatitis C virus replication as measured 
by RT-PCR assay for 5 weeks after transfection. These 
results demonstrate that the deleted 98 nucleotide 3'-UTR 
sequence was critical for production of infectious HCV and 
appear to contradict the reports of Dash et al. (1996) and 
Yoo et al. (1995), who reported that RNA transcripts from 
cDNA clones of HCV-1 and HCV-N lacking the terminal 98 
conserved nucleotides at the very 3' end of the 3'-UTR 
resulted in viral replication after transfection into 
human hepatoma cell lines. 

Transcripts of the (-42X) mutant (Figure 17C) 
were also produced and transfected into a chimpanzee and 
transcripts of the other five deletion mutants (shown in 
Figures 17D-17G) are to be produced and transfected into 
chimpanzees by the methods described herein. All 
transfected animals are then to be assayed for viral 
replication via RT-PCR. 

In two recent reports on transfection of 
chimpanzees, only those clones engineered to have the 
independently determined and slightly different consensus 
amino acid sequence of the polypeptide of strain H77 were 
infectious (Kolykhalov et al., 1997; Yanagi et al., 1997). 
Although the two infectious clones differed at four amino 
acid positions, these differences were represented in a 
major component of the quasispecies of the cloning source. 
In the present study, a single consensus sequence of 
strain HC-J4 could not be defined because the consensus 
sequence obtained by two different approaches (direct 
sequencing and sequencing of cloned products) differed at 
20 amino acid positions, even though the same genomic PCR 
product was analyzed. The infectious clone differed at 
two positions from the composite amino acid consensus 
sequence, from the sequence of the 8 additional HC-J4 
clones analyzed in this study and from published sequences 
of earlier passage samples. An additional amino acid 
differed from the composite consensus sequence but was 
found in two other HC-J4 clones analyzed in this study. 
The two non-infectious full-length clones of HC-J4 
differed from the composite consensus sequence by only 7 
and 9 amino acids. However, since these clones 
had the same termini as the infectious clone (except for a 
single nucleotide insertion in the 5' UTR of pCV-J4L4S), 
one or more of these amino acid changes in each clone was 
apparently deleterious for the virus. 

It was also found in the present study that 
HC-J4, like other strains of genotype 1b (Kolykhalov et al., 
1996; Tanaka et al., 1996; Yamada et al., 1996), had a 
poly U-UC region followed by a terminal conserved element. 
The poly U-UC region appears to vary considerably so it 
was not clear whether changes in this region would have a 
significant effect on virus replication. On the other 
hand, the 3' 98 nucleotides of the HCV genome were 
previously shown to be identical among other strains of 
genotypes 1a and 1b (Kolykhalov et al., 1996; Tanaka et 
al., 1996). Thus, use of the cassette vector would not 
alter this region except for addition of 3 nucleotides 
found in strain H77 between the poly UC region and the 3' 
98 conserved nucleotides. 

In conclusion, an infectious clone representing 
a genotype 1b strain of HCV has been constructed. Thus, 
it has been demonstrated that it was possible to obtain an 
infectious clone of a second strain of HCV. In addition, 
it has been shown that a consensus amino acid sequence was 
not absolutely required for infectivity and that chimeras 
between the UTRs of two different genotypes could be 
viable. 

WHAT IS CLAIMED IS: 

1. A purified and isolated nucleic acid 
molecule which encodes human hepatitis C virus, said 
molecule capable of expressing said virus when transfected 
into cells. 

2. The nucleic acid molecule of claim 1, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 14G-14H. 

3. The nucleic acid molecule of claim 2, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 14A-14F. 

4. The nucleic acid molecule of 
claim 1, wherein said molecule encodes the amino acid 
sequence shown in Figures 4G-4H. 

5. The nucleic acid molecule of claim 4, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 4A-4F. 

6. The nucleic acid molecule of claim 1, 
wherein a fragment of said molecule which encodes the 
structural region of hepatitis C virus has been replaced 
by the structural region from the genome of another 
hepatitis C virus strain. 

7. The nucleic acid molecule of claim 6, 
wherein said molecule encodes the amino acid sequence 
shown in Figures 16G-16H. 

8. The nucleic acid molecule of claim 7, 
wherein said molecule comprises the nucleic acid sequence 
shown in Figures 16A-16F. 

9. The nucleic acid molecule of claim 1, 
wherein a fragment of the nucleic acid molecule which 
encodes at least one HCV protein has been replaced by a 
fragment of the genome of another hepatitis C virus strain 
which encodes the corresponding protein. 

10. The nucleic acid molecule of claim 9, 
wherein the protein is selected from the group consisting 
of E1, E2 or NS4 proteins. 

11. The nucleic acid molecule of claim 1, 
wherein a fragment of the molecule encoding all or part of 
an HCV protein has been deleted. 

12. The nucleic acid molecule of claim 11, 
wherein the HCV protein is selected from the group 
consisting of P7, NS4B or NS5A proteins. 

13. A DNA construct comprising a nucleic acid 
molecule according to claims 1, 3, 5 or 8. 

14. An RNA transcript of the DNA construct of 
claim 13. 

15. A cell transfected with the DNA construct 
of claim 13. 

16. A cell transfected with RNA transcript of 
claim 14. 

17. A hepatitis C virus polypeptide produced by 
the cell of claim 15. 

18. A hepatitis C virus polypeptide produced by 
the cell of claim 16. 

19. A hepatitis C virus produced by the cell of 
claim 13. 

20. A hepatitis C virus produced by the cell of 
claim 14. 

21. A hepatitis C virus whose genome comprises 
a nucleic acid molecule according to claims 1, 3, 5, 6, 8, 
or 9. 

22. A method for producing a hepatitis C virus 
comprising transfecting a host cell with the RNA 
transcript of claim 14. 

23. A polypeptide encoded by a nucleic acid 
sequence according to claims 1, 2, 4 or 7 or a fragment 
thereof. 

24. The polypeptide of claim 23, wherein said 
polypeptide is selected from the group consisting of NS3 
protease, E1 protein, E2 protein or NS4 protein. 

25. A method for assaying candidate antiviral 
agents for activity against HCV, comprising 

a) exposing a cell containing the hepatitis C 
virus of claim 21 to the candidate antiviral agent; and 

b) measuring the presence or absence of 
hepatitis C virus replication in the cell of step (a). 

26. The method of claim 25, wherein said 
replication in step (b) is measured by at least one of the 
following: negative strand RT-PCR, quantitative RT-PCR, 
Western blot, immunofluorescence, or infectivity in a 
susceptible animal. 

27. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing an HCV protease encoded by a nucleic acid 
sequence according to claims 1, 2, 4, or 7, or a fragment 
thereof to the candidate antiviral agent in the presence 
of a protease substrate; and 

b) measuring the protease activity of said protease. 

28. The method of claim 27, wherein said HCV 
protease is selected from the group consisting of an NS3 
domain protease, an NS3-NS4A fusion polypeptide, or an 
NS2-NS3 protease. 

29. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 25. 

30. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 27. 

31. Antibody to the polypeptide of claim 23. 

32. Antibody to the hepatitis C virus of claim 
21. 

33. A method for determining the susceptibility 
of cells in vitro to support HCV infection, comprising the 
steps of: 

a. growing animal cells in vitro; 

b. transfecting into said cells the nucleic acid of 
claim 1; and 

c. determining if said cells show indicia of HCV 
replication. 

34. The method according to claim 33, wherein 
said cells are human cells. 

35. A cassette vector for cloning viral 
genomes, comprising, inserted therein, the nucleic acid 
sequence according to claim 2, said vector reading in the 
correct phase for the expression of said inserted sequence 
and having an active promoter sequence upstream thereof. 

36. The cassette vector of claim 35, wherein 
the cassette vector is produced from plasmid pCV. 

37. The cassette vector of claim 35, wherein 
the vector also contains one or more expressible marker 
genes. 

38. The cassette vector of claim 35, wherein 
the inserted DNA sequence contains at least one ORF of the 
HCV genome from any strain. 

39. The cassette vector of claim 35, wherein 
the promoter is a bacterial promoter. 

40. A composition comprising a polypeptide of 
claim 23 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

41. A method for treating hepatitis C viral 
infection comprising the administration to an animal in 
need thereof of a clinically effective amount of the 
composition of claim 40. 

42. A composition comprising a nucleic acid 
molecule of claim 1 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

43. A method for treating hepatitis C viral 
infection comprising the administration to an animal in 
need thereof of a clinically effective amount of the 
composition of claim 42. 

GCEAQOaOQC 1GAIQ3GQ3C GACACIDCAC CATCAAICAC TOOOCIGflGA. 50 

GGAACTACIG ' lLTlUA OXA GAAAGCGTCT AQGCA3Q303 TI&GmiGAG 100 

' lUlUUlG CAG C CTO A GG&C OJUUGCIOGC OT3AGAG0CA TACT33ICIG 150 

aSGAAQOQGT GAGIftCftOOS GAATIQOCAG (3AGGACXX3QG 'lULTl'lLTlU 200 

GftTAAACOOG CICAAIGOCT GGAGATITGG QCXJ1G0C3CEC GCAAGACTQC 250 

TAGOOGAGTA GTGTIQ33IC GOGAAAGQOC TIGTOGTACT G0CIGAIRG3 300 

G1U LT1 U CGA <3TGO30CG33 AGSICIOjIA GAG03IQCAC CATCAQGAOS 350 

AAIOCTAAAC CICAAAGAAA. AAOCAAACGT AACACCAACC GTO3X3CACA 400 

Q3A03ICAAG TIC0O333TG G033TCAGAT OGH7IQCT3GA (JITUALTIUI 1 450 

TX30aaaQC3^G GSXXDCTAGA. TIG33TGTQC G330GAO3AG GAAGACTTCC 500 

GAQQ3Gria3C AAOCICGAGG TAGA03ICAG QCTA1CQQCA. -AQ3CAO3I0G 550 

GCCQ3AG33C AQGACCIG33 CTCAGCD033 GTROJLT1UG COXTC EATC 600 

GCAATCAQ3G TlULU^xTCG G0333ATGGC TQC1GICTCC CXCT33CICT 650 

Q33QCTAQCT GQQQaCOCAC AGAULXX03G CGTCAG3I03C GCAATTIGGG 700 

TAAQGTCATC GATAOQCITA. aJlUUUUL.'l T CX3003AOCTC ATO33GTOCA 750 

TAQOQCTOGT O3QCX30aXT C I' lGG ftGOOG CT3QCAQQ3C OCT3G Q3GAT 800 

0303100093 TTCIG3AAGA OQQOC3IGAAC TATOCAACAG GGAAOCTIOC 850 

•B3GTIQCICT TICICEAICT T0CTICTQ3C OOTQL'ICICT T30CT GACIG 900 

tocqcqcttc aqoctaccaa gtqoocaatt octiosoosct TTPCCKIGVC 950 

AO^ATCATr GOOOEAACTC GAGTATIGIG TAGGAQ303G OQGATO3CAT 1000 

OCTOCACACT 00Q333TGIG ' 1 UJ L T HJ J G1 ' TOQOGAG33T AAO30CTO3A 1050 

G3TGTIQ33T QG033TOAOC OOCALUJiGG 0CA0CAQ3GA G33CAAACTC 1100 

CCCACAACQC AQCTICGAOG TCATATCGAT CIGCTIGTOG Q3 AQ0Q0CAC 1150 

CCTCIGCTOG GCU CT CTA03 T33Q3GACCT GT33333TCT GTCTFICTTG 1200 

TIQ3TCAACT GTTTACCTTC TOTCOCAGQC G3CBCT3GAC GAO30AAGAC 1250- 

TQCAATIGTT L ' l iAll'imi C 0Q30CATATA A0333TCATC GCAI QGCA TC 1300 

G3ATATCATC ATGAACT33T C00CTAO33C AGOSTiaSIG GTAQCICAQC 1350 

TQCIOOOGAT CCCACAAGCZ ATCAIQGACA TGAS03CIQG 1GCICACIG3 1400 

QGAGICCTQ3 G333CATAGC GTATITCICC AS03T0333A. ACIQQGOGAA. 1450 

QC3TOCIG3TA. GIQC1GCTQC TATITG0033 O3IOGAO30G GAAAOOCACG 1500 

TXZA00QQQ33 AAATGCCQ3C O30AOCAO33 CTQ330TIGT T33ICT0CTT 1550 

ACACCAQ333 OCAAGCAGAA. CAICCAACTG AICAACAOCA AOOGCAGTIG 1600 

QCACATCAAT AQCACQGOCT T3AATIGCAA TGAAAQQCTT AACA003GCT 1650 

Q3TI3^3CAG3 Q L ' llTlC TAT CAACACAAAT T2AACICTTC AQOCIGIOCT 1700 

GAGAQGTT3G OCAQCIG003 ACQDOTTAOC GAl'l'l'lOXC AG03TIG333 1750 

TOTTATCAGT TATOOCAACG GAAQ0Q3CCT OGAOGAAOQC OCCTACTGCT 1800 

QCO^CTAQCC T3CAAGACCT T3TQ3CATIG TQ0003CAAA GAG03IGTGT 1850 

G3CCCQ3TAT ATIQCTTCAC TCrCAGCCOC GIUJ1UJ1U3 GAAGGACOGA 1900 
CAGGIO30C3C G33CCTACCT ACAGCIG33G TOCAAATCAT ACQ3A1GICT 1950 

TOGIOCTTAA CAACACCAQ3 CCADCQC'IQS QCAATIG3IT UULjl ' lUia OC 2000 

TOGATCAACT CAACIG3ATT CAOIAAAGIG TGCQ3AQ03C OJULTlUJLUf 2050 

CATO3GAQ33 GIQ33CAACA ACAOLT1UCT CHX3CCOCACT GAT1ULT1UC 2100 

GCAAACAICC QGAAGCCACA TACICIOXfT CO33C\T0333 TOQCIQGATT 2150 

ACAOOCAQSr GCAIQ3ICGA. CTACCCGIAT AGOCmQQC ACimULTIG 2200 

TACCACDCAAT TACACCAIAT TCAAAGICAG GATCTAOGrIG G3AQQ33I03 2250 

AQCACAQQCT GGAAOCQQGC T3CAACIQGA CGCG33333A ACGCTCItSAT 2300 

CD33AAGACA G3GACAGGIC OGAGCTICAGCT OLLJI'IUUIUC 'lUiUJAOCAC 2350 

ACAGIQ32AG UIULTIOCGT GTICTITCAC GAOJL'IUXA QCX.T1U1ULA 2400 

CQQQaCICAT OCACCIOC A C CAGAACATTG TQGAQ3IOCA. GEAL.TiU.LAC 2450 

Q33GfEAQQGfT CAAQCATCQC G1UJ1QJG3GC ATTAAGIQ03 AGEACGIQjI' 2500 

luiLUiuri c criuiUL ' i ' i^ cagacgoqog caiciacroc toltiuiua 2550 

1GA !iUl'i!A CT CATATCCCAA G0QSAGGCO3 CTTIQGAGAA. CXJ1UJJJAATA 2600 

CTCAAIQCAG CATGCCIQ32 CQQGACQC^C GSICTIGIGT ULTlCLUUi' 2650 

(Jl'lLTiUlU C T l'lumiGGT ATCIGAAQC33 TAQ3IG33IG QQQG3AQCQ3 2700 

1CEACQCCCT CEACQ3GATC T3QCCICIQC TOJIGCICCT QJK33J3VLU 2750 

CCTCAQ0Q3G CATACQ2ACT GGACACQGAG GIGCCCGCGT CGIGIGGCG3 2800 

OJl ' lUriUrr GI0Q3GITAA TO30GCIGAC TCIGT033IA TATTACAAGC 2850 

GCTATATCAG CIOJIGCAIG TG3IQ3CTIC AGTALLTi'lLT GACCAGAGTA 2900 

GAAQOQCAAC T3CACGIUIG QJl'lOXCCC CTCAACGICC G3333333CG 2950 

CX3A3D3CX33IC ATCT3SCICA TGIGIGTAGT ACAQCCGACC CIQ3EATTIG 3000 

ACATCACCAA ACTACTCCIG GCCATCTTCXj GAOCCCITTG GATICITCAA 3050 

occAGfrrroc ttaaagiccc ctaltilxjig cojji'iuaag occriurooG 3100 

GATCTO03CG CTAQCG33GA AGAIAQO0Q3 AGSICATEAC GIQCAAAT33 3150 

CCAICAICAA GTTAG333CG CTEACIQ3CA (JL'JLmUlUlA TAACCATCIC 3200 

ACCCCIUTTC GAGACIG3QC QZACAACQ3Z CIQOGAGATC T33033IG3C 3250 

TGIG3AACCA (Jl UL iiUi 'lCT CCCGAATOGA GACCAAGCTC AICACGI03G 3300 

G33CAGATAC CQCCGCGIQC QSIGACATCA TCAACQXTT (JUULUJLUILT 3350 

QCCCGIAQ33 GCCAG3AGAT ACIQCITO33 CCAGCCGACG GAATOGICIC 3400 

CAAG333IG3 AOJriU - 'H33 CX300CATCAC G3CGEACQ0C CAQCAGACGA 3450 

GAG3CCICCT AQ3GTD3EATA ATCACCACOC T3ACTO3CCG QGACAAAAAC 3500 

CAAGI03AG3 GIGAG3IOCA GATO3IGICA ACIGCTACCC AAACCTIGCT 3550 

G3CAACGIGC ATCAAIG333 TATOCTIG3AC T3ICEACCAC Q33Q003GAA 3600 

CGAQGACCAT CQCATCACCC AAQ33ICCTG TCATCCAGAT GEATACCAAT 3650 

GTG3ACCAAG ACCTIUTG33 CTG30CCQCT CCICAAGGTT CCCGCICATT 3700 

GACACCCIGT ACCTQCOQCT CCTQ3GSACCT TEACCiGoIC ACGAGGCACG 3750 

CCGATCTCAT CQQCGAOJIG AIAQCAG3G3 TAGCCIGCTT 3800 

FIG. 4B 
TCG0C0033C GCATTiacm CTIGAAAGGC TU1T0333QG GTDCDCIQIT 3850 

GIGOQOCX303 G3ACAGGOOG TG33GCTATT CA033CO3GG GTGTGCftOOC 3900 

GTQGAGTGGC TAAAGOG3IG GALTi'UATCC CIGIOGAGAA. 0CBO33ACA 3950 

ACCATCAGAT OOLUJJIUi T CAOQGACAAC '1UJ1CICJCAC CAGCAGTGOC 4000 

OCAGAGCTIC raram snrrr: AOCTGCAIGC TCGGAGaQGC AQDQGTAAGA. 4050 

QCAOCAAG3T OC30G3CIGO3 TACGCAGOOC AQ33CIACAA. UJIUI'IUJIU 4100 

CICAAQ00CT ClUriOl ' IGC AALXJL'IGGGC 'iTlUUiUL'lT ACAJGICrAA. 4150 

Q3Q0CAIGGS GTTGATOCEA. AEATCAGGAC OGGGGTGAGA. ACAATEAOCA 4200 

CTGGCAQCpC CATCADGTAC TCXBCCIAOG GCAAUTBXT TGaOGAGGQC 4250 

GOilGCTC A G GAG3IGCITA TSACKmATA AITIUIGAOG AuIGujaCIC 4300 

CAG3GATOQC ACATOCATCT TGG3CAT0QG CACIGTOCIT GACCAAGCAG 4350 

AGACIG033G G30GAGACTG CJI'IUIULTCJS CTACTGCEAC CGCTGLOJGC 4400 

TOGGICACTG IGTOGZATOC TAACATDGAG GAQ3TIGCIC TGIQZACCAC 4450 

CQGAGAGATC CCCTITEAGG GCAAG3CT&T QQQGCIOGAG GIGAICAAGG 4500 

GGG3AAGACA TCIC MUi ' lVJ TOOCACICAA AGAAGAAGIG CGAOGAGCIC 4550 

GOGQOGAAGC T3GT03CATT GQ3CATCAAT GGOGTGXCT ACTACOQOQG 4600 

TCTIGAOGIG TCIGTCATOC QGACCAG033 GGAIGTTGTC GTOGTGTOGA 4650 

CCGAIGCICT a^IGACTGGC TITAO0G333 ACTTOGACIC TGTGATAGAC 4700 

TGCAACACGT GIGTCACICA GACAGTOGAT TICAGOCTIG ACGCrACCIT 4750 

TACCATIGAG ACAACCAGGC TCX3DGCAGGA T3CIGIUICC AGGACTCAAC 4800 

GCQQGQGCAG GACTGGCAGG GGGAAGQCAG GCAICIATAG ATTIGTG3CA 4850 

CGG3333AGC GOUUL'10035 CA U.GT1CG AC '1UJ1ULU1LC 'ICIGIGAGTG 4900 

CEAIGA03GG Q3CIGTGCIT GGTAIGAGCT CAO30G030C GAGACEACAG 4950 

TEAG3CTACG AGOGTACAIG AACACCOOGG GGLTJ.UCG3T GIQQCAGGAC 5000 

CA l iCi'lGAAT TTTG3GAG33 CGICTTTAOG G3CXTICACTC ATATAGAIGC 5050 

CCA LTI ' ITIA TCCCAGACAA AGCAGAGTGG GGAGAACTTT CCITAQCIQG 5100 

TAQQGTAOCA AGOCA003TG T3Q3CTAG33 CICAAG033C TO33DCAI0G 5150 

1GQGACCAGA. TGTGGAAGTG TTIGAI003C CITAAAGOCA OOCTOCKEGG 5200 

QOCAACAQGC CT3CTATACA. GACTGGGCGC TGTTCAGAAT GAAGICACGC 5250 

TGACGCACCC AATCAGCAAA TACATCAIGA CAIGCAIGIC GGCOGAGCIG 5300 

GAG3TGGTCA OGAGCACCTG GGTGCIOGTr G333333IOC TOGCIGCICT 5350 

GGCQGGGTAT TGO JI U iC AA CAGGC1UCGT G3TCATAGTG QGCAGG ATOG 5400 

'JLITI G IOCQG GAAGOOGGCA ATIAIAOCIG ACAGGGAQGT •IL'lUliAGCBG 5450 

GAGTCCGAIG AGA1GGAAGA UI UL' I U ICA G CACTTAQ03T ACA IOGAGCA 5500 

AG33AIGATC CTQGCIGAGC AGTTCAAGCA GAAG300CIC QQOCICCIQC 5550 

AGACCGCGTC COGCCAIGCA GAGGTTATCA COOCIGCIGT OCAGAGCAAC 5600 

TOQCAGAAAC TCGAGGTCTT TIGGGOGAAG CACAIGIGGA ATTICATCAG 5650 

TOGGATACAA TACTIGGOGG OJL'IGICAAC GCIUX'IGGT AADCCG3CCA 5700 

FIG. 4C 
TrocncATr GADxaacnrr pcpdjtgccg tcacoggcx: aci3^agcact 5750 

GOOCAAAOOC TGCICTICAA GREATIG33G G33IG33IG3 CTGOOCRGCr 5800 

eX30CGOC300C OJIULUJL'IA ClUULTi ' lUl ' Q33K3CH33C CU ft GJlUUm 5850 

oooacaroos caqostcqga cig33gaag3 tdctosioga c atiltiu ca. 5900 

030003103:: GGGAOJ1UIT CTAOJAT1CA. AGATCAIIGRG 5950 

O33TGAG0IC UGC'lUZAOOO AG3AOCIQ3T CAALLUIUL'IG OOOQCEEroC 6000 

'IL ' ILUUL'IUO AOJUUl'lUiA G1UJLJ1U1UL3 TCIG33CAGC AAXACIO03C 6050 

OOOCAOOnG GOOOQOOOSA O0030CAGIG CAAliUoALLUA. A OJGUL ' i!ft ftT 6100 

AGU L ' nU U CC TO0333333A ALXJAaUlTlU OD3CA03CAC TAOGIUJJUS 6150 

AGftOD3A3QC AGCDOOOaOC GICACIOXA TACTTAGZAG OCICAC'IUIA 6200 

ACQCAGCIDC TGAQQOGACr QCMCAGIG3 ATAAGCTIXG 6250 

'PXftilTaiTDC GOTTOCIOOC TAAQ3GACAT CEQ33ACIQG ATAIQ03AQG 6300 

TGCIGAQQGA. C1T1AAGAOC TQ3CIGAAAG CGAAQCICAT GGCACAACIG 6350 

GCIGOGATIC OCITIGIGIC CIQOCftGOOC G33TATAQ33 GQ3ICIQQ33 6400 

AQGAGA033C Ai'JmOJACA CIOGCIQOCA C1GTO3AGCT GAGA1CACIG 6900 

GACASGICAA. AAA0333AOG AIGAGSA2QG TOGGTOCEAG GAOCTQCW33 6950 

AACATCIQGA. GIQ3GAD3IT CQOCAITAAC G3T33VCAOCA. a33QQ00CIG 6550 

TAL'iULUJJLT anaOOOOGA ACTATAAGIT 0300010103 AQ33IGTCIG 6600 

CAGAQ3AAIA. GGIG G AGATA AQQ33Q3IQG QOGACTIQCA. CEAOOTKTOG 6650 

G3TA1GACIA. CIGACAATCT TAAAIQODCG TG3CAGA2QC CATO303D3A 6700 

AITITI CACA. GAATIQGAQ3 Q33IQO30CT ACACAQGflTT QOGCCCCCTT 6750 

QCAAQOCXnT QCTIQOOOOAG GAGOTAICAT TCAGAGTAQ3 ACICCAOGAG 6800 

TAC0GQGIG3 GGTD33ZAATT AOCTIQ33AG OO33AA0333 AOOTAGODOr 6850 

GTTIGA03TCC AIG^ICACTG ATCXTTOQCA. TATAACAGCA. GAG3O33003 6900 

QGAGAAQOTT G333AGAG0G TCA LUJUL ' IT CTAIQ33ZAG CIUL'IUUGLT 6950 

AQQCAQCIGT COGC'IO CA TC TCICAAG3CA. ACITOCACOG OCAAOCATCA. 7000 

CimZCIGAC GODGAGCICA. TAGAG3CTAA. OTICCIGTGO AG3CAG3AGA. 7050 

T303O3QCAA CAICAQCAG3 GTIGAGTCAG AGAACAAAGT OGIGATICIG 7100 

GftL' l U LT l Uj AliOJUL'l'lG'l 1 G32AGAG3AG GATCAG0Q03 AGOICTOOOT 7150 

ACXZD3CAGAA ATICIG33GA. AGIC'ICQGAG A I ' I CG DO OQS GODCIG0003 7200 

1CIG33CXXX3 GD333ACTAC AAO0O0Q33C TAGTAGAGAC GD33AAAAAG 7250 

0CIGACTAO3 AADCAOZLT3T GGTDCAIG3C 'IUUUUUCUAC CAGCTOCAOS 7300 

GIOICCIG CT G'lOl CI COGC CIDOGAAAAA. GaSTBGQOIG GIOCICAQOG 7350 

AATCAACCCT AICTPCIGGC TIQ3303AQC TIOOCAGCAA AAGTTTIQ3C 7400 

AQC'ICCICAA. CTI0O33CAT TAQ30333AC AAIAOGACAA CATOCICIGA 7450 

QCCQ30CCCT TCTOOCIQQC CCCOGGACIC GGALXJI'IUAG TCCTATICIT 7500 

CCAIODQCQC CCT33AQG3G GAQOZD3333 ATCQ3GAIUT CAQO3A033G 7550 

TCATGCTCGA. CQGICAGTCAG T3303Q03AC A03GAAGATC T03TGTQCIG 7600 
CICAAIGICr TALL'ILC'IGGA. CAQ303CACT OSICAO03DG TQ03CIQ0QG 7650 

AAGAACAAAA. AL'1GG3 CA TC AA03CACIGA GCAACIOJET GCIAQ3CXIAT 7700 

CACAATCIG3 TGTEATTOCAC CacnCAOQC ACT3CITGCX: AAAQ3CAGAA 7750 

GAAAGICACA TTIGACAGAC TGCAAGTICT GGACAGOCAT TAQCAQGACE 7800 

•1QCICAAGGA GGICAAAGCA G03G03ICAA AAGTGAAG3C TAACTIQCTA 7850 

'1U-XJEAGAQ3 AAGCTIGCAG CUJESAOGOOC QCSO^TICAG OCAAAIOCAA. 7900 

mTlUCJ CTKT GGGGCAAAAG A ULJIUOJI ' I G GCATOOCAGA. AAQGOIErlAG 7950 

QCCACATCAA. CIDDSIGIGG AAAGAOCTTC: TGGAAGACAG TGTAACAOCA 8000 

MBGAGACIA QCA11CAJQ3C CAAGAA03AG GTTTICIGCG TICfiGOCIGA 8050 

CaA033333T OSEAAGCCAG CIOSICICAT CUIUI'IOXC GAQCIQ3Q0G 8100 

1QQGGGIGIG QGAGAAGATC QCXTT3EAD3 AOGTGCTTAG CAAQCIOXC 8150 

CIGGOQGIGA TGGGAAGCIC CI7033ATTC CAAIACICAC CAGGACAGOG 8200 

QGTIGAAnC CIOGTGCAAG G3IGGAAGTC CAAGAAGADC COGA1G333T 8250 . 

TCIOJCAIGA. TJ^OOOGCIGT TTIGACTCEA CAGTCACIGA. GAG0GACA2C 8300 

OGTACQGAGG AGGCAATTTA CCAATOTIGT GACCIGGAOC QOCAAGaoaG 8350 

CGTG3CCATC AAGTCOCICA CIGAGAQQCT TEATCTIQGG Q3CXCIUITA 8400 

GCAATTCAAG GGGGGAAAAC T3333TTACC GCAGGTG003 CGGGAGQQGC 8450 

GTACIGACAA. CIAQCIGTG3 TAACACGTIC ACTIGCIACA TCAAGOOaCG 8500 , 

QQCAGGCIGT CGAGG03CAG GGCICCAGGA. CIGCAOCATC CID3IGIGIG 8550 . 

GGGAOGACTT AGICGTTAIC TGTGAAAGTG 0QQ3Q3IDCA GGAGGAO30G 8600 . 

GQGAGCCIGA. GAQCCTB2AC GGAGGCEAIG PCCPOJYPCT OCGCCQQQCC 8650 

CQQ3GACCCC CCACAACCAG AAIACGACIT GGAQCJITAIA ACATCA1GCT 8700 

CCICCAACGT GICAGiasaC CAC3GAO3C303 CIQ2AAAGAG GGICI&CIAC 8750 

CnaOJOSlG ACOZLACAAC aOOOCiaXB AGAGOT33T Q3GAGACAGC 8800 
AAGACACACT CCAGTCAATT OCIG3CTG3 CAACAIAA1C ATCTITQGQC 8850 
CCACACIGTG GGGGAQGAIG A2ACIGAIGA OOCAH'i'lL'lT TAQOGIOCIC 8900 
ATAGOCAG33 A1CAQCTIGA ACfi UJL ' lUri ' AACIGTGAGA. TCEA03GAQC 8950 
CIGCTACICC ASAGAAGCAC T33A1CIAOC TCCAAICATT CAAAGACTCC 9000 
A1G30CICAG CGCATITICA CTXACAGTT AL'ICItXAGG TGAAAICAAT 9050 
AGG3TG3CGG CAIGOCICAG AAAACTIG33 GIOQQQQQCT T30GA0CTIG 9100 
GAGACACCQG CCCCGGAQ03 TCXD303CTAG GCTICIGTO: AGAGGAQQCA 9150 
G3GCT3GCAT ATCIG3CAAG TACCTCTICA ACTG3GCAGT AAGA ACAAAG 9200 
CTCAAACICA CTCCAATAGC QGCCXSTIGGC: CG3CIG3ACT T3IO0QGTIG 9250 
GTTCACX33CT Q3CTACAQ0G GG3GAGACAT TEATCACAGC GTGICICAIG 9300 
CCGQGGCCCG CEQGTICIGG TTTIGQCTAC TQCIGCTO3C T3CAGQQ3TA 9350 
Q3CA1CEACC TCCTCCCCAA 0QGAIGAAQ3 TIG333TAAA CAC1O33G0C 9400 

tcttaagcca ' lTiojiuiTi ' TrmTriTr TTTrrrmT Trrr iurrr r 9450 
TrmTiLTr Tarmccrr crrrrrriuu titctittic ccticittaa 9500 
' l GC ylUUC'IUC AlUrUAGOOC TAGICAOQQC TAGCT3IGAA AQ3TO03IGA. 9550 
GGOQCAIGAC T3CAGAGAGT GCIGM3VCTG GOCICICIGC AGATCATCT 9599 

MSINEKPCPK TKKNINRRFQ DVKFPQG3QI VGGVYLLPFR GPKLCMRASR 50 

KISEESQPK3 RRQPIEKABR PK3KIWAQPG YPWPLYGNEG OaaftGWLLSP 100 

PGSRPSW3PT DPRBRSRNLG KVIDIUIOGF ADLM3YIELV GAPLQSAARA 150 

IAH3VFVLED GN^YATGNLP GCSFSIFIIA LLSCLTVPAS AYgVKNS9GL 200 

YHVINDCENS SIVYEAAEAI LKTFGCVFCV REENASRCWV AVTPTVAIRD 250 

aapngi^R hdxlvgsat dcsalyvgcl ogsvelvgql ftfserkhwt 300 

TQDCNCSIYP GHTIGHFMAW EM4MNWSPIA. ALWAQLLRI PQAIMEMIftG 350 

AHW3VIAGIA YFaMoNWRK VLWTiT.T.FfiG VDAEHfl/IGG NAGKITftGLV 400 

(3JLTEGRH3Sr IQL3NINGSW KENSTAUQCN ESLNIGWLAG LFYQHKFNSS 450 

' GCPERIASCR PISYAN3SGL IERFXCWEOP PRPGGIVPAK 500> 

SVOGPVYCFT PSPWVGTID FSGAPTYSWG ANDIDtfFVLN NTRPELGNWF 550 

GCIWMNSIGF TKVOGAPFCV IGGVGNNTLL CPIDCFRKHP EATYSROGSG 600 

FWTTFROWD YPYRLWHYFC TI2SJYTIFKVR M¥VGGVEHRL EAAOXWIBGE 650 

paXEEKERS ELSPLLLSTT QWQVLPCSFT TLPALSIGLI HLH3SHVEM2 700 

YLYGVGSSIA SWAIKWEXW LLFLLLADAR VCSOJfoMtfLL ISQAEAALEN 750 

LVHJSIAASLA GIH3LVSFLV FHZEAWYLKG KWVPGAVYAL YGM/^IPLLLLL 800 

LALPQFAYAL DTEVAASOGG WLVGLMALT LSPYYKRXIS 850 

VPPLNVRGGR DAVXLLMTW HFILVFDITK LIIAXPGELW 900. 

3U2&SLLKVP YFVRVQGLLR ICALARKIAG GHW^IAIIK D3ALT3TWY 950 

NHLTPLRDWA. HNGLRDLAVA VEPWFSHME TKLTIW3ADT AAOGDIIN3L 1000 

FVSAEEGQEI L1GPADGMVS KGWKLLAPIT AYAQQIKGLL GCUTSUIGR 1050 

EKNQVEGEVQ IVSEA.TQTFL ATCINGVCWT VYK3AGTKIT ASFKGFVK24 1100 

YINVDQDLVG WPAPQSSRSL TPCICGSSDL YLVTRHAD7I PVFRRGDSBG 1150 

SLLSPRPISY LKGSSQGELL CPAGHAVGLF FAAVCTRGVA KAVEFIPVEN 1200 

LCTIMRSPVF TENSSPPAVP QSPQVAHLHA FI G3C3CSIK V PAAYAAQGYK 1250 

VLVLNPSVAA TD3FGAYMSK AH3VDPNIRT GVRiTl'lGSP TTXSTXGKFL 1300 

ADQQCSGGAY DIIICDEEHS TDATSXLGIG TVLDQAEIAG ARLWLATKT 1350 

PPGSVTVSHP HEEEMALSTT GEIPFYGKAI PLEVIKGGRH UPCHSKKKC 1400 

DELAAKLVAL CTNAVAYYPG m/SVIPTSG DWWSIEftL MIGFIGETOS 1450 

VIDCNICVT2 TVDFSLDPTF TIETITLPQD AVSKIQRRGR TGFGKPGIYR 1500 

FVAPGERP9G MFDSSVLCEC YDAGCAVJYEL TPAETIVPLR AYMNITGLPV 1550 

CQEHLEFWEG VFIGLTHTDA HFLSQTKQSG ENETYLVAYQ ATVCARAQAP 1600 

PPSWD3WKC UERLKPTLHj PIPLLYRLGA VQNEVTLTHP ZEGOMICMS 1650 

ADLEWISIW VLVGGVLAAL. AAYCLSIGCV VIVGRIVLSG KPAIIPEBEV 1700 

LYQEFDEMEE CS^ILPYTEQ GMMLAEQEKQ KALGLLQIAS FHAEVTTPAV 1750 

QINWQKLEVF WAKHMWNFTS GIQYLAGLST LPGNPAIASL MAFEAAVTSP 1800 

LTTGQTLLFN ILGGWVAAQL AAPGAATAFV GAGLAGAAIG SVGLGKVLVD 1850 

IIAGYGAGVA GALVAFKEMS GEVPSTEDLV NLLPAILSPG ALWGWCAA 1900 
HERHVGPGE GfiVQWMISKLiI AFASR3XIHVS PHOVEESm AARVEAUSS 1950

L1VTQLLSRL HQWISSECTT FC93SWLBDI WDWICEVLSD FEOVJLKAKLM 2000 

PQLPGIFFVS CQFGYRGVWR GDGIMHIPCH 0GAETH3iVK fcCIMRIVGPR 2050 

•TCRNMWSSTF PINAYITGPC TPLPAPNYKF ALWRVSAEEY VKEREM3JEH 2100 

KCPOQEPSEE FFIELCGVRL HRFAPFCKPL IPEEVSHFWG 2150 

IHEXFVGSQL PCEPEPIMAV LTSMLTDPSH nfcEAAGRRL APGSPPSMAS 2200 

SSASQLSAPS LKA1L.'1ANHD SPDRELIEHN LUWRQEM33M riWESENKV 2250 

VXLESFDPLV AEEEEREVSV PAEHRKSRR FARALPVWAR PDYNPPLVET 2300 

WKKPDXEPPV VH3CELPPPR SFPVPPPRKK RIWLTESTL SIALAELAIK 2350 

SPGSSSTSGI TGTNTITSSE PAPS3CPPDS IKZESYSSMPP T.H»!KJLMJL 2400 

SDGSWSIVSS GADIEEKA/CC SMSYSWIGAL VTPCAAEEGK LPINALSNSL 2450 

LRHHNLWSTT TSRSADCPQK YVTFDBIQJL DSHYQCWIKE VKAAASKVKA 2500 

NLLSVEEACS LTPPHSAKSK FGXGAKEVRC HARKA.VAHIN SVWKDLLEDS 2550 

VrPLUlTlM A KNEVFCVQPE KGC33KPARLI VFPDLGVRVC EKMALYDWS 2600 

KLPLAVM3SS "5CTQYSP0QR VEFLVQAWKS KKTPM3FSYD THZFDSTVTE 2650 

SDIRIEEAIY QCCCLDPQAR VAUCSLTERL YVQGPLHNSR GEN33¥FPCR 2700 

ASGVLTTSGG NILTCYIKAR AACRAAGLQD CIMLVG3CDL WICESAGVQ 2750 

EDAASLRAFT EAMIRYSAPP GDPPQPEYDL ELTTSZSSNV SVAHDGAQCR 2800 

VYYLTRDPTT PLARAAWEIA RHTPVNSWD3 NIIMFAPTLW ARMELMIHFF 2850 

SVLIARDQLE QALNZEEYGA. CYSIEPLDTP PIIQRIB3LS AFSLHSYSPG 2900 

EINRVAACLR KLGVPPLRAW RHRARSVRAR LLSEQGRAAI QGKHjFNWAV 2950 

RIKLKLTPIA AAGRLDL9QW FTAGYSGGDI YHSVSHARPR WFWF TT . T . T.T A 3000 

AGVGIYLLPN R 3011 
GGCAQQOOO: 1GAIG93QQC GACACICCAC CAIGAATCAC TCXXXTOIGA. 50

QGAACIACIG TCTICAOGCA GAAAQCJGIUT AGOCATOXG TEAGIMGAG 100 

TCICGIQCAG (XTOCAQGftC COOCCCTCCC GQGAGAQOCA TAGIGGICIG 150 

CQGAAQ033T GAGIACACGG GAATIQOCAG GAQSAOQGGG TOLTJL'ILTJLU 200 

GATCAAOQOG CTCAAIQCCT QGAGATITQG QCXJTQdOOCC GOGAGACTOC 250 

TAQCOGAGIA CTGTIGQGIC G03AAAQQ0C TIGIGGfEACT QOCIGATAQG 300 

GK3CTIQQGA GT3COCQGQG AQGTCIOJCft. GAOOGIQCAC CATCAGCA03 350 

AATCCIAAAC CICAAAGAAA AAQCAAAOST AACAQCAACC GCD93ZCACA 400 

GGA03ICAAG TICECX3QQOG GTOGTCAGAT OGTIQGIQGA. CTITA0CT3T 450 

T3Q0Q0QCAG QQQQCCCAQG TldQSIGIGC QCX3DGACIftG GAAQQCTICC 500 

GAGCG3ICGC AAQCIGGIQG AAQ30GACAA OCJimOXAA AQQCIOQQOG 550 

ACOGGAQQQC AGG3CCIQQ3 CICAQaCOGG CTAOOCTIGS COOCTCEAIG 600 

GCAATGAG33 GCTQ3G3IQG QCAG3AJQ3C TGCiGTCACC OOGCGQCICC 650 

CX3QOCTAGIT QGQ32COCAC GGACXXXDCGG CCTAGCTOGC GIAACTIQQG 700 

TAAGGICATC GATAQCCITA CATGCX3QCIT CGGQGATCIC ATQQGGEACA 750 

TTCCGCTOCT OGGQOCCCQC CTAQQ93Q0G CIGOCAGQGC CTIQQCACAC 800 

GCTCTCCG3G TICIQGAQGA CQGOGIGAAC TA3QCAACAG GGAACTIGOC 850 

cgctiqcict tictctatct tqctctiqqc tctgcigtoc TCITIGACCA 900 

TCCCAQCTTC CX3CTTAIIGAA GIQCX3CAAD3 TGICCQQGAT AD^OCAiTOTC 950 

ACXSAAOGACT QCTCCAACIC AAQCATTGJIG TA3GAQQCAG CGGAOCTGAT 1000 

CATCCATACT G2CGGGTD3CG TOQCCIGIGT TCAGGAGQGT AACAGCIOCC 1050 

GTTOCTGGGfr AGQQCTCACT 0CCAO3CIOG OGGOCAGGAA T3GCAOCX3IC 1100 

CCCACEACGA CAATACGAQG CCAD3IOGAC TIX3CID3TIG QGAQGQCT3C 1150 

TTTCTOCrCC QCTATCTACG TOOGOGATCT CIGOQSATCT ATTTIOCIOG 1200 

TCTCOCAGCT GITCACCTTC TOQCCIOGCC G3CATCAGAC ACTGCAQGAC 1250 

TGCAACIGCT CAATCIATCC 0G3QCATCIA TCAGGICAOC GCA3GGCITG 1300 

QGATATCATG ATGAACIQGT CACCEACAAC AGOXTAGIG GIGItGCAGT 1350 

TCCIOOQGAT CCCACAAGCT GICCT3GACA TQGIGQ0Q3G GQOQCACIQG 1400 

QGAGIOCIQG CGQQQCTIQC CTACTATIOC ATG3IAQ3GA ACIGQQCEAA 1450 

QGTICIGATT GIQ3QQCTAC TCTTTQ003G COTTCACX3QG GAGACCCACA 1500 

OGAQGQQGAG GGIQGQQQQC CACACCACXTT C0Q9GTICAC GTOOLTITIU 1550 

TCATCTQQQG CGIUICAGAA AATCCAGCTT GTGAATACCA AOQQCAQCIG 1600 

a^O^TCAAC AGGACIQQQC TAAATTQCAA T3ACTC0CIC CAAACIGGGT 1650 

T LTl ' lU COGC QCTJl'l'l'lAC GCACACAAGT TCAACID3IC CQ0CT3CCO3 1700 

GAGCGCAIQG OCAQCTQOGG CXXTATIGAC TOGIT03CQC AQQQGflQQQG 1750 

CCCCATCACC TATACIAAQC CTAACAQCIC G3ATCAGAQG (XTEATIGCT 1800 

QGCATEAOQC QCCICGAQCG T3IQGIGI0G TACC0Q03IC QCAQGIGIGT 1850 

QGICGAGIGT ATIGTTTCAC CCCAAQQOCT G1TGTOG1GG GGAOCACOGA 1900 
GIXXCTAQC3T ATAQCIQ33G GGAGAA1GAG ACAGAD3IGA 1950

-IGCICCICAA. CAACAO30C3T OOQOCACAftG GCAACIGGTT (JUUL'IGHACA 2000 

TOGATCAAIA CTACIGGGTT CACTAAGACG 1GCX3GAQ3IC lUJJJlGTAA 2050 

CATCQ33333 GIOGCTAftGC QCAOCITGftT CTG332CAD3 GAL'IULTILC 2100 

GGAAQCAOOC CGAQXHZACr TACACAAAAT GIG3Cia3QG UUUL'lULJl'lG 2150 

acaqctaggt gsctagtaga cr&aocAaac Aoajmujc aciaq3dct3 2200 

GACIGTECRRT TITICCATCr Tl^fiGGITftG Gg^iUlALLUm ULJcUUuJiGS 2250 

AG0003CT C^ASQ033CA TOCfcATIGGA CIOGAQ3AGA GDC3CICTAAC 2300 

TIQ3AGGACA. GQGKEAQGIC AGAACTCAQC OT3CT3CTQC 'l G'IC'Ift CftAC 2350 

AGAGIG3CAG ATACIQ33CT UIULTl'lCAC CACXXHSmS GC TITKEOC & 2400 

CTGCTTIGAT CCKTCTCCKT CAGAACATOG T33AD3IGCA AS&LL.'1U1!AC 2450 

Ub'l UI AGGST CAQ aJl'l'lUr CTCCTTIGCA ATCAAA3G33 AGTACATCCT 2500 

Gi'lUL'l'l ' l'lL Cl'lULUC'lGG CAGAQ3CX3CX3 OTIGIGIGOC TGCITGTQGA. 2550 

TGATGCIGCT GAIAGCCCAG GCIGAG3Q0G OCTTAGAGAA CTIGGIG3IC 2600 

CTCAATOQGG CGTCCGTGGC 033AG03CAT G3TATICICT UL'iTlLTlUl' 2650 

GTICTICTQC GCOGCCIGGT ACATTAAQ33 CAGGCTOGCT 0003330X3 2700 

CX3TATOCTIT TTAIGGOGIA TQQ303CIGC TCCTOCICCT ACIGGCCTIA 2750 

CCACCAGGAG CTIAD30CTT GGA003QGAG AT30CTQCAT O3IGO30QQG 2800 

T3CQ3ITCTT GTAQGTUIGG TATTCTIGAC CTIGTCAOCA TACTACAAAG 2850 

lUl ' l'iL ' llA C TAG3CICATA TOJIUJI'IAC AATACTTTAT CACCAGAQQC 2900 

GAG3CGCACA TOCAAGTCTG GGTCXXOGOC CTCAAOSTIC GQG GAQ3G0G 2950 

CGATOOCATC ATCCICCIGA CGTEGIQCX3GT TCATCCAGAG TTAAU.TJ.TlG 3000 

ACATCACCAA ACICCIGCIC QCCATACIOG GCQ03ZTTCAT GCTGCIQCAG 3050 

OTX3GCAIAA CGAGAGTGGC GTACTimiG OGOSCICAAG GGCTCAXTQG 3100 

TGCATSZATG TTAGTGOGAA AAGT03O33G G3CJICATEAT GIOCAAATOG 3150 

TCTTCATGAA GCT333CG0G CIGACAQGTA CGTAQGTLTTA TAADZATC3T 3200 

ACCOCACTQC GQGACIGGGC CCAQGQ3332 CTAD3AGAOC TIQ0 G3IGGC 3250 

GGTAGAGCCC GTG3ICITCT COGOCATQGA. GACCAAQ3IC ATCAOZD333 3300 

GAQZAGACAC 03CEGOGTGT GGGGACATCA TCTIQQGTCT ACmnUICC 3350 

GCCCGAAQ3G Q3AAQGAGAT ATTITTGGGA. CXIGGCIGATA GIUILGAAGG 3400 

GCAAQGGTG3 OSACIDCTIG CX90O2ATCAC G32CTACIOC CAACAAAOGC 3450 

GGGGCGTACT TGGITGCATC ATCACIAQOC TCACAQSCOG GGACAAGAAC 3500 

CAQGTCGAAG GQGA03ITCA AO' I UJITILT ACXDGCAACAC AA3CTTTCCT 3550 

033GAGCB3C ATCAAOGGQG TGTG^IGGAC TGTCTACCAT GQCX3CIQ0CT 3600 

CGAAGACCCT AGCCGGTOCA AAAGGTOCAA TCAOOCAAAT GEACADCAAT 3650 

GTAGACCIGG ACCTOGTOG3 CTGGCAGGOG CCOCCQGG33 CQOQCTGCAT 3700 

GACACCATGC AGCIGTGGCA GCIOQGACCT TTACTTGGTC AQGAGACATG 3750 

CIGATGIC^T TCCGGTG33C CG3CGAGGOG ACAQCAQQGG AAGTCIACTC 3800 
itxoocftQac cxxJiciccm qcig&aaggc iuluuleoig Giacan GCT 3850

T IUGCCTIOG GGGCAOGTOG T33303IdT CLUJLjC ' IGCT G1GICJCA0CC 3900 

0aQ33GfIO3C GAAG303GIG GACTICAIAC CDGTIGAGTC TAIGGAAACT 3950 

A0C3OGC3Q3T C1ULUU1 L T1 ' OCAGACAAC TCAAUXOX" OGSCIGIAaC 4000 

GCBGACATK: CAACjlGGCAC ATCTEGCAGGC TDCTACIG3C AGOGGCAAGA 4050 

GCAOCAAAGT GCXX33CIQ33 TAIGCAGOOC: AAGGGTACAA. UGI UCIOSI C 4100 

CIGAAOOOGT OOGTIG00GC CBlXTiaGGG TT3X33C3303T ATALLUIULAA 4150 

GQCACACGST ATOGAOOTIA ACATCAGAAC T33GGTAAQG ALLm'iAOCA 4200 

C0G30Q3CIC CATTA03EAC TOCACCTATC GCAAGTTCCT 1GCXX3A09ST 4250 

Q3L. '1 U1'1L' 1 G GGGSC30CIA T3ACATCATA ATA3GTGAIG AGTOOCACIC 4300 

AACIGACIDG ACTAGCA1CT TGOGCATOGG CACAGTOCIG GACCAAGG3G 4350 

AGACQSCIGG AGOGOGGCIC UIUUIUL-'ILU CXSCOGCIAC AOCTO033GA 4400 

TO3GTIA0DG TGOCACAGOC CAATATOGAG GAAATA030C 1GIUCAACAA 4450 

1GGAGAGATC GCCTICIATC GCAAAGOCAT CCOGATIGAG QGCATC ZAAGG 4500 

G3GGGAGGCA. TCICATTTIC TGCCATIOCZA. AGAAGAAAIG TGAGGAGCIC 4550 

GOCQCAAAGC 1GACAG30CT CGGACIGAAC GCIGTAGCAT ATTACCX333G 4600 

CCTIGATCIG TG03ICATAC CQOZCATO33 AGAGGTCGTT GTCGTQGCAA 4650 

CAGACGCIUT AATCACGGGT TICA003G0G ATTTTGACTC AGTGAI 03AC 4700 

TGCAATACAT GTGTCACOCA GACAGTOGAC TICAQCTIG3 AiaXAOCIT 4750 

CAOCATIGAG ACGAOGAOOG TC32QCCAAGA. OGOGUIGICG OGCIOGCA AC 4800

GGOGAGGTAG AACT3GGAGG GGTAQGAGIG GCMCIACAG GTTIGTCACT 4850

CCAGGAGAAC GGCCCTO333 CATCITOGAT TCTICGGTCC 1GTGIGAGIG 4900

CIMGACGCG GGCTGIGCIT GGTAIGAGCT CADGGCDCOCT GAGAGCTO3S 4950 

TTAGGTIGCG QQCITACCIA AATACAQCAG GGTIGCCQGT CIGCCAGGAC 5000 

CA1CK3GAGT TUIQGGAGAG CGICTICACA GGOZICAQCC ACATAGAT3C 5050 

CCACTICCIG TCCCAGACTA AACAQGCAGG AGACAACTIT CCITACCIGG 5100 

1GGCATATCA AQCIACAGTG TGQQOCAGGG CTCAAGCIOC ADCIDCATOG 5150 

1GGGACCAAA TGIGGAAGIG TUICATACGG CIGAAAGCTA CACTGCA03G 5200 

GCCAACAOCC CIGCIGTATA GGCIAQGAGC OGTOCAAAAT GAGGTCATCC 5250 

TCACACACCC CATAACTAAA TACATCAIGG CAIGCATGTX: GGCIG AGCIG 5300 

GAQ3IOGTCA CTAGCAQCTG GGTGZEGGTA QQQ3GAGTOC TIGCAQCTIT 5350 

GGCCGCATAC TGCCIGACGA CAQGCAGTGT G3TCATIGTG G GCAQG ATCA 5400 

1CTIGTCCGG GAAGOCAGCr GTCGTICQQG ACAGGGAAGT OTIUIAGCAG 5450 

GAGTTCGATG AGATO3AAGA GIGTGOCTCA CAACTTOCIT ACATDGAQCA 5500 

GGGAAIGCAG CTCGCCGAGC AATICAAGCA AAAGGQGCIC GGGTIGTIGC 5550 

AAAGG3CCAC CAAGCAAGCG GAGGCIGCIG CICCCX3IGGT GGAGICCAAG 5600 

TOGCGAQCCC TIGAGAGCIT CIGGQOGAAG CACATGTGGA ATTICATCAG 5650 

CQGAATACAG TACCTAQCAG GCITATDCAC TCIGCCIOGA AACC0030GA 5700 
TAQCATCATT GATOOCATIT ACAQLTIL'IA TCACTAGDOC OTCAOCACC 5750

CAAAACAOQC TOCT3TITAA CA1LTJLUGQG QGAIQ33TGG C1QXXJAACT 5800 

O3CIOOT00C AQQ3CTOQGT CAQCTTiaST GQQOSCmSC ATOGOCOSftG 5850 

CQ3L-'1U1'IUG CM3ZZm>£33C CTI033AAQ3 IQL'lUJiGCA CA2CTIQ303 5900 

G3CTAnX33QS CAG333TAQC: Q3333CACTO GIGQOLTl'iA AQGICAIGAG 5950 

G3G0GAQ3IG CC3CIDCACCX3 AOGAOOTG3T C&ACTEftCIC G11GGJALLLXJ 6000 

TCICICCIG3 1Q00CIQ3IC GT03333T03 TGIt3CX3GAQT AATACTGGGl' 6050 

Q00CX333AGA. G32Q0CTGIG CAGIQC3A3GA. 6100 

AQ ULjl ' IC GCr 10333333m AlDCAQGICIC 00CIAQ32AC TR3GIQQCIG 6150 

J^GAQ03A03C T3CAO3A03T GTCACICAGA TCL'IL'IUUAG CCmOCATC 6200 

ACTCAACIGC TCAAG033CT CXZAD3AGIG3 ATI3^A!IGAQ3 ACIGCICTAC 6250 

Q3ZALLG3ICC G3CID3IGC32 TAAG33ATGT TIG33ATIG3 ATATOOAOSS 6300 

T3TTGACIGA. CTICAAGACC T333IOCAGT OCAAACIOCT G303333TIA 6350 

CCG33ACT2C Cl ' l ' lU- ' lUlU ATO3CAA03Z G33TACAAG3 GAGICIGQQ3 6400 

GQGQ3AQ3GC ATCA3QCAAA O3A0CTQQ3C ATGOGSAOCA CAGAIO3003 6450 

GACATGTCAA AAA033TICC ATGAQ3ATCG TAG333CTAG AAQCIG2AGC 6500 

AACA03T03C ACQ3AAQ3TT C03CATCAAC GCAIACAQCA O33GA0CTIG 6550 

CACACCCTCC COQSGQaOCA. ACimi'CCAG Q3032TA1G3 O333T30CTG 6600 

CIGaGGAGIA CGTQGAG3TT A03CGJIG'IG3 Q33ATTIOCA CTAQ3TGA0G 6650 

G3CAT3A0CA CT3ACAAOGT AAAGTOODCA. TC3CAG3TTC 0333COD33A 6700 

ATICTICAQ3 GAG3TQGAT3 GACT3QG3TT GCACAG3TAC QCID33333T 6750 

GCAAACCIUT TCTAQ333AG GA03TCAQ3T TCCAQGTD3G GCTCAAOCA* 6800 

TACTIQ3ID3 QGIOGCAGCT CCCATGCGAG CCO3AAC033 AOGTEAACAGT 6850 

QCTIACTia: AT3CICACCG ATCCCTCCCA CATIACAGCA. GAGAQ33CTA 6900 

AGCGTAG3CT GG3TAGAQ33 TCTCCCCCCT CTITAGCCAG CICATCA33T 6950 

AQCCAGTIUT CTGOGQCTIC TTIGAAG333 ACATO3ACTA Q3CAOCATCA 7000 

CIOQCQ33AC GCIGAQCTCA T3GAQ33CAA CCICTT3TQ3 CG3ZAG3AGA 7050 

TQ3G033AAA CATCACTOSZ GIQGAGICAG AGAAIAAG3T AGTAATICTG 7100 

GA L 'l L ' l 'l' lL G AALX.U_.T1LA CG33GAQ33G GAT3AGAQ3G AGATATD33T 7150 

Q3333333AG AT33I033AA AA.TCCAG3AA CTTOQCCTUA G03TTG3CCA 7200 

TATQ332ACG 0CO33ACTAC AATCCIOZAC TCCIAGAGIC CT3GAAQGAC 7250 

CCGaACIACG TCCCTC033T GGTACAQQ3A T3CCCATIGC CAOCTAOCAA. 7300 

GGC1CCIDCA ATACCACCTC CAQ33AGAAA GAQ3AQ33TT GIOTIGACAG 7350 

AATCCAAT3T CT LT I L 'I G CC TTO3333AG2 T033ZACTAA GAQCTID33T 7400 

AG3T0333AT O3TCX330CGT TCATAQGGQC A03333AO33 OOLTIUL'ILA 7450 

CCTQ3CCTCC GA03AD33T3 ACAAAGGATC Q3AQ3TIGAG TC3TACICCT 7500 

OLAT300GCC 03TIGAAG33 aAGC033333 AQ3Q33ATCT CAGQCaAQGQG 7550 

TCIT33ICTA CCGTX-ACT3A G3AG3ZEAGT GAG3AT3T03 TCI03IQCTC 7600 
AATCIOTIAT AOGTQGACAG G03CXZCIGAT CAOGCCA1GC GCIQOGGAGG 7650

AAAGTAAGCT QOXATCAAC COGTIGAGCA ACTCTITGCT GOGTCAaCAC 7700 

AACA3X33ICT AGQCCACAAC AIODOGCAGC QCZAAG3ZTIDC G3ZAGAAGAA 7750 

GJECAGCITT GACAGATIGC AAJJIUL'IUGA T3A1CATIAC CQQGA03iaC 7800 

TCAAGGAGAT GAAQGCGAAG GOJICCACAG TTAAG3CTAA GLTlUUMid' 7850 

AIAGAQGAGG CCTGCAAGCT GAOGGCCCCA CATIOGGOGA AAIOGAAEIT 7900 

TOGCIMGOG GCAAAQGAOG TCOGGAAOCT ATOCAOCAGG G003ITAAGC 7950 

ACATCCGCTC OGTGTQGGAG GA LTIUL ' IG G AAGACACTGA AACACCAATI 8000 

GACACCAOCA TCA1GGCAAA AAGTCAGGTT TICIQ33TCC AAOCAGAGAA 8050 

GGGAG3CCGC AAGCCAGCTC GULTliAlUyr AT1ULCAGAC CT33GAGTIC 8100 

GIX3IA1GCGA GAAGAIQGCC CTTTAGGACG TGGTCTOCAC ULTIUL'IUAG 8150 

GOOGTGA3GG QLllCICAIA CDGA2TICAA. TACTODOCCA AGCAG CGG3T 8200 

CGAUl'lCCIG GTGAAIACCT GGAAATCAAA GAAA1GCCCT AIGG GLT1LT 8250 

CATATCACAC OOQ L ' lUl ' l ' l'l ' GACTCAACQG TCACTGAGAG TGACATKGT 8300 

GTIGAGGAGT CAATTIACCA AIIGTIGTGAC TIGGCCCCCG AS33CCAGACA 8350 

G3CCATAAGG TCGCICACAG AGCGGCTTIA CAT33GGGGT OQCCTG ACTA 8400 

ACTCAAAAGG GCAGAACTGC GGTTATCGCC GGTGCCGCGC AAGTGGCGIG 8450 

CTGACGACTA GCIQ333TAA TACCCICACA TGTIACTIGA AG3XACIGC 8500 

AQCC'IGTCGA QCTGCAAAGC TCCAGGACTC C^CGAIGCTC GTGAAOGGAG 8550 

ACGACCTIGT CGTTATCIGT GAAAGOGCGG GAACCCAGGA GGAIGCGQCG 8600- 

GCCCTAOGAG CCTICAOGGA GGCTATGACT AGGTATTOQG OC GCCOOOGG 8650- 

GGATCCGCGC CAACCAGAAT AGGAGCIGGA GCTGAIAACA TGAIGTTOCT 8700 

CCAATGTGTC AJ3IGG0GCAC GA.T3CA.miG GCAAAAGGGT ATACTAOCTC 8750 

ACCCGTGACC CCACCACCCC CCTIGCACGG GCT333TG3G AGACAGCIAG 8800 

ACACACTCCA A3CAACTCTT GGCTAGGCAA TATZAICATG TA1GCGCCCA 8850 

CCCTATG33C AAGGATGATT CTGAIGACIC ACTTTTICIC CATCOTCTA 8900 

QCTCAAGAGC AACTIGAAAA AGCCCIGGAT TGTCAGATCT A0G333CTIG 8950 

CTACTCCATT GAGCCACTTG ACCTACCICA GAICATIGAA CGACICCATC 9000

GTCTTAGCGC ATITACACTC CACAGTTACT CICCAG3TGA GATCAATAGG 9050

GTC3GCTICAT GOCTCAGGAA ACTIGGGGTA CCACOZTTGC GAACCIGGAG 9100

ACATCQ33CC AGAAGTGTOC GCGCTAAGCT AL'IUICCCAG GG3GGGAGGG 9150 

CCGCCACTIG TG3CAGATAC CTCITIAACT GG3CAGTAAG GACCAAGCTT 9200 

AAACICACTC CAATCGCG32 CGCGICOCAG CIGGACTTGT CTGGCTQGTT 9250 

OGTGGCTGGT TACAGCGGGG GAGACATATA TCACAGCCIG TCICGTGCCC 9300 

GACCCCGCTG GTTICCGTTG T3CCTACTCC TACTTCUIGT AG3GGTAQGC 9350 

ATTTAOCTGC TCCGCAACGG AT3AACGGGG AGCTAACCAC TCCAQ3CCTT 9400 

AAGnZATTIC CIGTITTTTT 'I ' lTl ' l ' l ' l ' l ' l ' l ' TITITITITT 'lUl'l'l'lTl'l'l 9450 

■ ,, T , r , , , 1 ,, 0 ^ 7 , r^ rcrricrr7 TTTIOCTTIC TTITTDGCTT CTTTAAT3GT 9500 
G3CTOCAICT TPGCCCmJT CAD33CEAG2 IGIGAAAQ3T GCGIGftGOOS 9550
CATCACIQCA GAGftGIQCIG A3ZACIQ30CT L'lUIGCAG&T CATCT 9595 

FIG. 14F
MSINPKPCRK 1KFNINRRPQ UJKFBG33Q1 VQGVXLLERR GPRLGVRATR 50

KASER9QERG RRQPIPKRRR PEGRftWAQPG YPWPLYGNEG D3tfAGWLL£P 100 

K3SKPSW3PT DPKHH SHNLG KVIDTLTOSF AEOGKIELV GAPLQ3AARA 150 

GVNYATCNLP GCSFSIELLA LLSCZLTTPAS AYEVRNVSGI 200 

YHVINDCSNS SIVYEAAD7I MHIPQCVFCV QEENSSROWV ALTPTLAAKN 250 

ASVPTITIRR HVDLLVGIAA K^AMYVGEL. Q3SIELV9QL FTFSPRRHET 300 

VQDCNZSIYP GHVSGHRMBW EMMMMdSPTT ALWSQLLRI PQAVVIMUftG 350 

AHWGVLAGLA YYSMVO^WAK VDGEIHITGR VAGHTISGFT 400 

SLFSSGA9QK IQLVNINGSW HTNKEAI20J LFiffiHKFNSS 450 

QCFERMASGR PHWE^QONG PHYIKPNSS DC^IEYCKHYA PRPOGWPAS 500 

QV03PVYCFT PSPVWGTID RSGVPTYSW3 ENEIEM4LLN NIRPPQGNWF 550 

GCIWMNSIGF TKT033PPCN IGGVO^ILI CPTECERKHP EAIYEK03SG 600 

PWLTPRCLVD YPYFXIrtHYPG TLNFSIFKVR MYVGGVEHRL. 650 

PQgUECRERS ELSPLLLSTT EWQTLPCMT TLPALSTGLI HLH3NHVEI/Q 700 

YLYGVGSAFV SFAIKWEYIL 1JLFLIJ1ADAR VCACLWtfMLL IAQAEAALEN 750 

LWLNAASVA GAH3IXSFLV FFCAAWYIKG KLAPGAAYAF YGVWPLLLLL 800 • 

LALPPFAYAL DREMAASOGG AVLVGLVFLT LSPYYKVFLT KLIVMJ3YFI 850 

TRAEAHM3VW VPPLNVRGGR DAIUXTCAV HPELXEDITK LLLAILGPLM 900. 

VLQAGITKVP YFVRAQGLIR ACMLVRKVAG GHYVI^IVFMK LGALTCIWY 950 

NHLTPIKDWA HAGLRDIAVA VEPWFSAME TKVTIWSAUr AACGDIIB3L 1000 

PVSAKRGKEI FL/3PADSLEJ3 Q3WRI1APIT AYSQ3TRGVL GCIITSLTGR 1050 

WSTATQSEL ATCENGVCWT VYH3AGSKTL AGPKGPZTO 1100 

YINVDLBLVG WQAPPGARSM TFCSQGSSDL YLVTRHAU7I PVRRR3DSRG 1150 

SLLSPRPVSY LKGS9Q3PLL CPSGHWGVF RAAVCTRGVA KAVDFTPVES 1200 

MEITMRSPVF TCNSTPPAV? QTPQVAHLHA PTG9GKSIKV PAAYAAQGYK 1250 

VLVLNPSVAA TLGFGAYMSK AHG3DPMERT GVRi'l'l'lGGS ITYSIYGKFL 1300 

ADGGC93GAY DIIICDECHS TDSTTTLGIG TVLDQAEIAG APXWLATAT 1350 

PPGSV/IVPHP NIEEIGLSNN GEIPFYGKAI PIEAIKGGPH LTFCHSKKKC 1400 

DELAAKLTGL GE23AVAYYRG IZVSVIPPIG EWWATDAL MIGFIGDFDS 1450 

VIDCNICVIQ TVEFSLDPTF TTEHTIVPQD AVSRSQRFGR TGPGRSGIYR 1500 

FVTPGERPSG MFDSSVLCEC YDAQCAWYEL TPAETSVPJJR AYLNTPGLPV 1550 

CQEHLEFVES VFIGLTHEDA HFLSQIKQAG I3SFPYLVAYQ AIVCAPAQAP 1600 

PPSWDQMtfKC LIFXKPTLKG PTPLLYKLGA V^SEVTLTHP TIKYIM&CMS 1650 

ADLEWISIW VLVGGVLAAL AAYCLTTGSV VIVGRIILSG KPAWPEREV 1700 

LYQEFDEMEE CASQLPYIEQ GMQLAEQFKQ KALGLU^TAT KQAEAAAPW 1750 

ESKWRALETF WAKHMaZNFIS GIQYLAGLST LFGNPAIASL MAFTASITSP 1800 

LTTQNTLLFN IDGGv^AAQL APPSAASAFV GAGIAGAAVG SIGLGKVLVD 1850 

ILAGYGAGVA G&LVAFKVKS GEVPSTEDLV NLLPAHSPG ALWGWCAA 1900 
10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890

HjRRHVGPGE GKVQWMNRLI AEASRGNHVS PUKVEESDft. AAKVIQILSS 1950 

LTTIQLLKRL HJWINEDCST PCSGSWLKD7 WDWICIVL1D FKIWLQSKLL 2000 

PRIJGVPFLS CQPGYKGVWR QXJEMQTICP CGAQIAGHVK aGSflR IVGER 2050 

1CSNIWH3IF PINAYTIGPC TPSPARvKSR ALWRMAAEEY VEVIKVGDFH 2100 

YVTGM37IENV KCPO^PAEE FFIEVDGVFL HKYAPACKPL I^ECWTPQVG 2150 

LN2YLVGSQL FCEPEFD7IV LTSMLTDPSH IIAETAKRKLi AHGSPPSLAS 2200 

SSASQLSAPS LKATCITHHD SPDAIXjIEAN IXWRQEM33N ZERVESENKV 2250 

VELDSFEPIH AE3EREISV AAETLRKSRK FPSALPIWAR PEM^PPLLES 2300 

WKDPDYVPPV VH3ZPLPPTK APPIPPPEKK KrVVLTESOV SSALAELAIK 2350 

TTOSSGSSAV DSGUffiALED IASCDGEKGS DtfESYS34PP T.rtTf . K iUiAJb 2400 

SDGSWSIVSE EASEDWOCS MSXTWIGALI TPCAAEESKL PINPI£NSLL 2450 

RHHNMrarr srsaslsqkk vihkl quld hgetwikem kakasivkak 2500 

LLSIEEACKL TPPHSAKSKF GYGAKDJRNL SSPAVNHIRS VWEDLLEDTE 2550 

TPinnTMAK SEVPCVQPEK QGRKPARLIV FPDLGVFVCE KMALYEW5T 2600 

LPQA.VM3SSY GPQYSPKQRV EFTJVNIWKSK KCPM3FSYDT RCFDSIVIES 2650 

DIRVEESIYQ CCDLAPEARQ AIRSLTERLY IG3PLTNSK3 QND3XFRCRA 2700 

SGVLTISQQJ TLTCYLKAIA ACKAAKLQDC TMLVN3XLV VICESAGIQE 2750 

DAAALRAFTE AMTFCfSAPPG DPPQPEYDLE LTTSCSSNVS VAHDASGKKV 2800" 
YYLTRDPTTP 1ARAAWETAR HTPINSWLGN IIMffiPTLJWA RMHMIHFFS 2850 
ILLAQEQLEK ALD02IYGAC YSIEPLDLPQ IIERLB3LSA FTLHSYSPGE 2900 
INRVASCLRK H3V/PPLKIWR HRARSVRAKL LSQ3C3RAATC GFYLFT^IAVR 2950 
TKLKLTPIPA ASQLDLSGWF VAGYSQGDIY HSLSRAKPFW FPLiZLLLLSV 3000 

GVGIYLIjPNR . - . ■ 3010 
#2. Strategy for constructing a chimeric clone of HCV (pH77CV-J4),
which contains the nonstructural region of strain H77 and the
structural region of strain HC-J4



[Figure: schematic maps of pCV-H77C (5' UTR; C+E1+E2+P7; NS2; NS3; NS4A; NS4B; NS5A; NS5B; 3' UTR) and pCV-J4L6S, marking the restriction sites used (Age I (156), Cla I, Nde I* (2763), Eco47 III (2851), Hind III, Afl II (9403) and Xho I (2282)), the PCR products, the fusion PCR step, and the primers H2751S (Cla I/Nde I), H2870R, H9140S(P-M), J4 2271S and J4-2776R (Nde I).]

1. Fragments A, B, C and D: PCR amplification from pCV-H77C or pCV-J4L6S

. Fragment A: additional Cla I site, artificial Nde I site introduced by a single mutation
(C→T at nt 2765 of H77C) and authentic Eco47 III site

. Fragments B and C: Nde I site eliminated by a single mutation within the primers
(C→T at nt 9158 of H77C), and fusion PCR with both fragments

. Fragment D: artificial Nde I site introduced by 2 point mutations within the primer
(T→A at nt 2762 and C→T at nt 2765 of J4L6S)

2. TA cloning of PCR products

3. Sequence analysis

4. Cloning of Fragment A (Cla I-Eco47 III) and Fragment B/C (Hind III-Afl II) with correct
sequence into pCV-H77C

5. Complete sequence analysis of the new cassette vector [pH77CV], into which the structural
regions of different genotypes can be inserted

6. Cloning of the Age I/Xho I fragment (cut out from pCV-J4L6S) and Fragment D (Xho I-Nde I)
with correct sequence into the new cassette vector; 3-piece ligation

7. Complete sequence analysis of the 1a+1b chimera [pH77CV-J4]

8. In vitro transcription (within 24 hours of inoculation)

9. Percutaneous intrahepatic transfection into chimpanzee
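
The site changes listed under step 1 can be checked with a short script. The sketch below is only an illustration in Python and is not part of the original strategy. The function names (find_sites, point_mutate) and the short toy windows are hypothetical: the flanking bases are made up, and positions are counted within each window, not in the HCV genome. Only the Nde I recognition sequence (CATATG) and the base changes themselves are taken from the list above.

```python
NDE_I = "CATATG"  # Nde I recognition sequence


def find_sites(seq, site=NDE_I):
    """Return the 1-based start position of every occurrence of site in seq."""
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(site, start + 1)
    return hits


def point_mutate(seq, changes):
    """Apply substitutions given as {1-based position: (expected_base, new_base)}."""
    bases = list(seq.upper())
    for pos, (old, new) in changes.items():
        if bases[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, found {bases[pos - 1]}")
        bases[pos - 1] = new
    return "".join(bases)


# Hypothetical toy windows (flanks invented for illustration only).
window_a = "GAGCATACGCAC"     # CATACG: one C->T change creates a site (Fragment A)
window_bc = "GCTGCCATATGTGG"  # CATATG: one C->T change removes the site (Fragments B/C)
window_d = "GAGCTTACGCCT"     # CTTACG: T->A plus C->T creates a site (Fragment D)

mutant_a = point_mutate(window_a, {8: ("C", "T")})
mutant_bc = point_mutate(window_bc, {6: ("C", "T")})
mutant_d = point_mutate(window_d, {5: ("T", "A"), 8: ("C", "T")})

for name, before, after in [
    ("A", window_a, mutant_a),
    ("B/C", window_bc, mutant_bc),
    ("D", window_d, mutant_d),
]:
    print(name, find_sites(before), "->", find_sites(after))
```

The expected_base check in point_mutate is a deliberate guard: it fails loudly if a coordinate is off by one, which is the most likely mistake when transferring positions from a numbered listing.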
GCCAQOGCCC TGAIGGGGGC GACACICCAC CATGAAICAC TO00CP311A 50

QGAACTACIG TCTICAOGCA GAAAQ03TCT AGCCAIQ303 1TOCTAIGAG 100 

T GICUXUCa G OCIDCAGjAC CaJ GU L 'lUU L ' GQoAGAQQCA. TAGT 13J1U1U ISO 

COGAAOCOCT GAGIACA033 GAA l ' lOXA G GAO3AO0333 lULTl ' lLTlU 200 

GATCAA0Q03 CnCAATOOCr Q3AGATTIG3 QlGIGLmJC G33AGACIGC 250 

TAQCXD3AGTA. GIGTP333TC QOGAAAQ3CC TTU1UUJJACT GCCT3AIAG3 300 

U1ULT1UOA Giaam333 AGGTCIOGTA GAC03IGCAC OmSAGCAOG 350 

AATOCIAAAC CICAAAGAAA AA0CAAAO3T AACAOCAAOC GCD30DCACA 400 

GGAOSICAAG TIU3333333 GTO3ICAGAT CUI'IUJIULA (JITlMUL'lUr 450 

T3CO303 2 AG GQG333IAGG TIGGGTGTGC G033GACIAG GAAG3CTT0C 500 

GAG03GJICGC AAQ1TCGIGG AAQGQ3ACAA OCTATCQZAA. AG33D03GC33 550 

ACOGGAQ33C AG333CT333 CICAGOCCGG CTALULTiLG CQX'IL'IAJG 600 

OCAAIGAQQG CCT33G3TGG GCAGGAIGGC 'lUUlUlLAOC Q33333ZD3C 650 

CGG33TAGTT GQQ30002AC GGACC0333G OGTCAGGTO3C GTAAL.T1UJS 700 

TAAQ3TCAIC GATftOJLTiA CAT3333CTT CGCCGA3CIU ATG333TACA 750 

TIOJGCIDjT Q3Q030GG3C 0170330303 CIGQCAQ33C CTT33CACAC 800 

G3T3TC033G TTCIGGAQ3A (333D3TGAAC TA2GCAACAG GGAA LT1ULXJ 850 

T333AGCTIC CG3TTATGAA GTO03ZAACG T3T33333AT AIAOCAIGIC 950 

ACGAAOGACT QC7ICCAACIC AAQCATIGIG TAT3AGGCAG CQGAQ3TGA.T 1000 

CATOCATACT CUXU3TGOG T33CCT3T3T T3AQ3AQ33T AACAGCTOOC 1050 

Ul ' lU L' iU JJ l' AG03TICACT CCCACGCTD3 CG3CCAQ3AA T3QCAG03TC 1100 

CCCACTAD3A CAATACGACG (3ZA03TO3AC TIGCIU3TIG G2A033CIOC 1150 

TUIODCAGCr GTICAU-T1C laaXTCOCJC G3CAIGAGAC AGTGCAGGAC 1250 

TOCAAC'IGC T CAAICTATOC CX330CA3Gm TCAGGTCACC GCALiUJLTIG 1300 

GGAIAIGAIG ATGAACIGGT CACCIACAAC AG333TAGTG U1U1UUCAGT 1350 

T3CIO033AT CCCACAAG3T GICGIGGACA TO3T333333 G3CCCACT33 1400 

GGAGT C C 1 UG CJULUULTiUC CIJiCJIATICC AIGGTAGGGA ACIG33CIAA 1450 

GGTICIGATT GT03332TAC ' HJl'l'lU L UL L; CGTIGAD333 GAGAQ3CACA 1500 

O3A03333AG G3TG3CC33C CACAOGACCT OGG3GTICAC GILU-TITIL' 1550 

TCAICIGGGG OSTCICAGAA. AAIDCAGCIT GTGAAIAOCA ACX33CAGCIG 1600 

GCLACAICAAC AG3ACTG2CC TAAATIGCAA TGACICmiC CAAACI033T 1650 

TCITIQ003C QCTIGTTITAC GCACACAAGT TCAACIOSTC 0333103033 1700 

GAGG3CATGG CCAGC1Q0CJS CCDCATIGAC T33Tia3333 AG333TQ333 1750 

CCCCATCACC TATACIAAGC CTAACAG3TC GGAICAGAGG GLTim'lULT 1800 
G3CATTA03C GCCIOGAGOG T3IG3IGIDG TACCGXGIC QCAQGflGIGr 1850
GGTCCAGIGT ATIGITTCAC COCAAGCXTT GTIGIG3IGG QGACEAOOGA 1900 

' imricoas r gtoctacgt atagcigggg ggagaaigag acagagoiga 1950 



G3AAGCACCC CGaGGCTACT TACACAAAAT CJIUJL'IUUULS UJUUlUbTlU 2150 

ACACCTAGGT GCXTIAGTAGA CTACCCATAC ALJJL'l'l'lUJC ACEWDOOCIG 2200 

CA L'IUICA AT TlTlLLMC r TE^LLJl'l^G GAliUimUlU UJJJJLU1UG -2250 

AQCACAGGCir CAATGCCGCA TGCAATIGGA CTO3AGGPGA GCGCTGTAAC 2300 

TIGGAGGACA GGGATAGGIC AGAACICAGC OJUL'IGC'IUC TGIdBCAAC 2350 

AGAGTGGCAG ATAL1UCCCT GIQCTTICAC CACUJ1ACCG CJLTi'UKlULA 2400 

CIGGTTIGAT CO&CTCCAT CAGAACATCG TGGAUJ1ULA AmJL'IUUftC 2450 

GGIGT&GOGT CAG OUlTlUr CIDCITIGCA ATCAAATGGG PGTM^OCT 2500 

GTlUL ' iTi ' lU Cl 'lL ' lUl TC G CAGAO3G303 Q3TGIGIQCC TGLT1U1UA 2550 

T3A1GCTGCT GATAGCCCAG GCIGAGGCOG CCTEAGAGAA CTIGGIGGIC 2600 

CTCAA2GCGG CGTGCGTGGC CGGAGCGCAT GSIM'XCICT (JLUTlLTlUf 2650 

GTCCTICTGC GCCGCCTGGT AC^TEAAGGG CAQ3CIG3TT CCTG3GGCG3 2700 

CGTATOCTTT TIA1GGCGEA T33003TTOC TGCTQCICCT A£TIGGCGTEA 2750 

CCACCACGAG CATATGCACT GGACACGGAG GTGGCCGCGT OGIGTGGCGG 2800 

OJl ' lUl ' lUrr GTOGGGTTAA TGGCGCTGAC TCTGICQOCA TATEACAAGC 2850 

GCTAIA3CAG L ' lU JlLCAIG TG3TD3GCTIC AJim'lTlLT GACCAGAGTA 2900 

GAAGOGCAAC TGCACGTGTG GGT1LIOXC CTCAAOGTCC QGGGGGG303 2950 

GGATOCXXJIC MCTI30CA TGTGTGTAGT ACACC33ACC CT GGTATH G 3000 

ACAICACCAA ACTA C1UJ1G GCCA1CTICG GACCULTriG G AT1LT11AA 3050 

GCCAGTTIGC TIAAAGICCC CTALTiOJlG GGCJGT1LAAG GCTICIGOG 3100 

GAICIGCGCG CTAG0333GA AGATAGCCGG AGGICATTAC GTGCAAAT33 3150 

CCATCAICAA GTIAGGGGCG CITACIOXA CCTATGTGTA TAACCAJCTC 3200 

ACCCCTCTIC GAGACTGGGC GCACAACGGC CTOCEAGA3C 'lUJ XGIGX 1 3250 

TGIGGAACCA (J1LG1LT1LT CCCGAAIGGA G^O^AAGCTC AlC AUiii^ 3300 

QQGCAGATAC CGCC30GTOC GGTGACATCA TCAACGGCTT GU- UJIL'ILT 3350 

GCCCGTAGGG GCCAQGAGAT AL'IULTIGGG CCAGCCGACG GAA1UJ1U1L' 3400 

CAAGGGGIGG A QJI'IUJIGG COCCCATCAC GGCGTACGCC CAGCAGAGGA 3450 

GAGGCCTCCT AGGGTGTATA ATCACCAGCC TGACIGGCCG GGACAAAAAC 3500 

CAAGIGGAGG GTGAGGTCCA GA1CG1GICA ACIGCTACCC AAAOCTTCCT 3550 

GGCAACGTGC ATCAAT3GGG TATGCIGGAC TCTTCTACCAC GGGGCQQGAA 3600 




2000 
2050 
2100 
OGAGGACCAT CQCZAICACCC AAG3GTCCIU TCATOCAGAT GTA1AOCAAT 3650

GTrGGAGCAAG AGLTIUIUGG CIG3CXD0GCT GCICAfiOGnT CGCGCTCATT 3700 

GACACQCTGT ADCTGGGGCT OCTOGGACCT TTA C C1UJ1L' AOGAGGCAOG 3750 

CGGAIGTCAT ILXJLUiGCGC CX3GOGAGGTG AIAGCAGGGG TAGO-'lULTl' 3800 

OCfi iiTlUJiA CTIGAAAGGC T0CIO330GG GTGCGCIGTT 3850 

GIQ0000003 GGftCAOSCDS TGGGOCTATT CAGGGOOGCG GIG1GCAOX 3900 

GTX3GAGIGGC TAAAGOGGIG GACTTEATOC UiUluGAGAA CCTAGGGACA 3950 

AOCAIGAGAT OLUJULjlUlT CAOGGACAAC TCCICTCCAC CAGCIAGIODC 4000 

CCAGAQCTIC CAQGT3GCGC AGCTGCAIGC T0CCAOD33Z AGGGLsEAAGA 4050 

QCACCAAGGT QDQ3GCTGOG TAGGCAGCOC AGGGCTOCAA GGHCTIG3TC 4100 

CTCAAGOOCT ClUriUL ' IG C AAG3C1GGGC TTTG3IGTIT ACA1UIUUAA 4150 

GGQCCAIGGG GTIGALLUJ1A AIAICAGGAC GGGG3IGAGA ACAATEAOCA 4200 

CIQGCAGOOC CATCACCTAC T03CCIA03 GCAAGriOLT TOGOGAGGGC 4250 

QGGIGCICAG GAOJIUJI'IA TGACATAA2A ALLT1U1UACG AGTGOCACIC 4300 

CACG3ATOCC ACAIOCAIUT TGGGCAIOGG CAL'lUlUJl'r GACCAAGCAG 4350 

AGACTGOGGG GGOGAGACIG GTIUIUL ' ILU OCAL'iLC'JJAC OZCICCGGGC 4400 

1U03I C ACIG T3TOOCATCC TAACATCGAG GAUJl'IGCIC TGTOCAOCAC 4450 

CGGAGAGAIC GOCTITEAGG GCAAGGCEAT OOGCCTOGAG GTGATCAAGG 4500 

G333AAGACA TCIC ALLUl ' lC T33ZACICAA AGAAGAAGTG CGAOGAGCIC 4550 

GC03QGAAGC T33TCGCATT GGGCATCAAT GOOGTGGOCT ACTAGO30GG 4600 

-IUITGACGTG TCTGICATCC GGAGCAG033 CGAIGTIGTC GT03IGID3A 4650 

CGGAIGCIUT CATGACTGGC TTTAO03333 ACTIUGACIC TGTGATAGAC 4700 

TGCAACAOGT GIGTCACICA GACAGTGGAT TICAGOCITG ADZXTTAGCTT 4750 

TACCATIGAG ACAAOCAQGC 1Q03Z3CAGGA TGCTGICTOC AGGACTCAAC 4800 

GCGGG3QCAG GACTGGCAGG GGGAAQQCAG GCAIUTATAG ATTIGTGGCA 4850 

COQGGGGAGC GOCCCTCGGG CAIGTIGGAC 'lUJlUCGTCC TCIGTGAGTG 4900 

CTA3GACGOG GGCIGTGCIT GGTATCAGCT CAOGG033GC GAGACEACAG 4950 

TIAQ3CTACG AGOGTACAIG AACACOGCGG GGCTICOOGT GIGOCAGGAC 5000 

CAICTIGAAT TITGGGAGGG GGICTTEAGG GGOC^CACTC ATATAGAIQC 5050 

CCA LTl'l ' im 1GOCAGACAA AGCAGAGTGG GGAGAACTIT GCTEACCIGG 5100 

TAGOGTACCA AGCCACGGIG T3CGCTAG33 CICAAGOGOC TOXCGAIOG 5150 

TGGGACCAGA TGIGGAAGTG TIT3AIOCGC CTEAAAGGCA UUL'ICCMQS 5200 

GCCAACACCC CTGCTAIACA GACTGGGG3C TGTTCAGAAT GAAGICAGOC 5250 

1GACGCACCC AATCACCAAA TACAICATCA CA3GCAIGTC GGOJGA GCIG 5300 

GAGGTGGICA GGAGCACCIG GGTGCTDGTT GGOGGOGICC TGGCIGCICT 5350 

GGOCGOGTAT TGC C1U1CA A CAGGCIGOGT GGTCATAGIG GGCAGGAT0G 5400 
SCnGICCOG GAftQOGQXA ATIAIAQCICG ACAGGGAGGT 'lL'lL'JJACCAG 5450

GAGTiaGAIG AGA1QGAAGA GIQCICICAG CAdTftaDGT ACATO3AGCA 5500 

AGGGAIGAIG CIU3C7PGAGC AGTTCAAQCA. GAALOXHIC Q30CIOTGC 5550 

AGACX30O3IC 00GOCASGCA. GAGGTIATCA. lUJ L'lUL'lG T 5600 

TGGCAGAAAC TO3AGGTCTT T 1G330 GA AG CACATCIQGA. ATTICA1CAG 5650 

TOG3ATACAA. TALT1G3333 GCCIGICAAC QL'lUUL'lLUr AAGOOOGOCA. 5700 

TlUJl'lCA TT G KIUJUITI T ACAGO'lOOOG TCAOCAGQa: ACTAAOZACr 5750 

GGCCAAAGO: TOCICTICAA CAIATIGGGG (JJJ1UJJ1U3 C1G0CCAGCT 5800 

QGCJoaaaaoc ajiuuuJUiA ciuulvltiu t qggigctggc ceaul'iuggs 5850 

CC0CCAIO33 CAGOGTIGGA. C1GGG GA AGG TCL'lLLflGGA. CATICTIGCA 5900 

GG3TAT330G CGGSOGTGGC GGGAGCICIT GTAGCATICA. AGA1CAIGAG 5950 

CQSIGAQGIC <J U L ' 1UC AG3G AGGACCIGGT CAATCIGCIG CXTOOCAIO: 6000 

■lUlUUUL'iQG AG UULT1U1A GIOGGIGIGG TCIG03CAGC AATACIG03C 6050 

CQGCACGTIG G0CC3333GGA GQ3GSCAGTG CAAIG3A3GA AC033CEAAT 6100 

AQGCnOQCC TO00QG33GA AOCATCTTTC QCCCACQCAC 6150 

AGAGGGATOZ AGOCaCQOGC GTCCACTGOCA TACTCAGCAG OTCACIGTCA 6200 

ACGCAGCIO: TGAGGOGACT QCATCAGIGG ATAAGCI03G AGTGTACCAC 6250 

TCCAIGCICC aJl ' lLL ' lGG C TAAGGGACAT CTG33ACIGG AIATGOGAGG 6300 

TGCIGAQOGA. CTITAAGACC T33CTGAAAG CCAAGCICAT GOCACAACIG 6350 

CCIG3GATIC CCTTIGTGTC CTGOGAG03C G3GTAIAG3G G3GTCIGG0G 6400 

AGGAGACGGC ATEAIGCACA. CTD3CIGOCA. CTGTG3AGCT GAGATCACIG 6450 

GACAIGTCAA AAACG3GACG ATSAGGASOS T033I0CTAG GACCIGCAQG 6500

AACA1GTGGA GTGGGAGGTT CCOCATIAAC GOCTACAQCA CG33Q00CIG 6550

TA L'lUL UL' l ' l ' CCIGCGCCGA ACTATAAGTT C333J1U1U3 AG33TC?ICIG 6600

CAGAGGAATA GGTGGAGATA AGGOGGGT3G GQGACTT32A CTAOGTATCG 6650

GOEAIGACTA CTGACAAICT TAAAIGCCQG TGCCAGAICC CAID3CQCGA. 6700

ATTTnCACA GAATIG3AOG QLJJ1ULQCXIT ACACAGGTIT GOGCOCCCIT 6750 

GCAAGQCCIT GCIG033GAG GAGGTKTCAT 1CAGAGTAGG ACIUZAOGAG 6800 

TALXJLUJ103 G3IC3CAATT AOCTIGOGAG O0QGAAG033 AGGraGOOCT 6850 

GTIGA03TCC AIGCICACTG ATCQC7IQCCA TA.TAACAGCA GAU^LUX 6900 

GGAGAAGGTT G3GGAGAGGG TCAQOOOCTT CTA3G30CAG CKCT033CT 6950 

AGCCAGCIGT COGCTCCA.TC TCICAAGGCA ACIT3CACCG OCAAOCATCA 7000 

CTCCCCIGAC GCTGAGCICA TAGAGGCTAA. COTCT3TGG AQ3CAGGAGA 7050 

1GGG033CAA CAJItZACCAGG GTT3AGTCAG AGAACAAAGT GGTGMTOG 7100 

GA L ' ILLTILU ATCOGCITGT GGCAGAGGAG GAIGAGOQGG AGGTCTOOGT 7150 

ACCTGCAGAA ATICIG03GA AGTC1CQGAG ATIO3CC03G GOCCTGCCQG 7200 

FIG. 15D
1CIQ3333GG GC033ACEAC AAD3XXZD3Z TAGTAGAGAC GIGGAAAAAG 7250

AAOOM X ' lUl ' QGTIOCA3D3C TGOJLXJLTAC CAOJ1UCAD3 7300 

UlU X CIOCr GIGOCroOQC: CTGGGAAAAA ULUJLAOQ3IG &TCCTCA003 7350 

AA3CAAC0CT A!ICTACIQCC TIGGOCGAGC T1UGCACCAA AftUl'ITlUUL: 7400 

iO JlUL ' JO A CITCCQ3CAT TAO33303&C AAZEACGACAA CA2CCTCTGA 7450 

QCXD33a00CT ' lUlUJL ' lO X OOOOCGfiCIC CXaAOGTIGfiG 'lUL'JJAli'lLTl' 7500 

CCTGGAGGGG GAG0CIG333 ATOD3GA1CT CAG33A0333 7550 

l UA lOJ I OG A CGG7ICAGTAG TGGGGOOGAC A03GAAGKIG TOiiUlUL ' lU 7600 

CICAA1GTCT TATTCXTIGGA CAG3CGCACT CGICAOJOCG 'lUCJGL'lUUULJ 7650 

AAGAACAAAA ACIG00CA2C AA03CACTGA 0CAACIO3IT OnftCGCCRT 7700 

CACAATCIGG TGTATIOCAC OCTICAC3Z AGTGCTIGOC AAAGGCAGAA 7750 

GAAAGTCACA TTO3ACAGAC TGCAAUriLT G3ACAGC3CAT TAOCAGGAQG 7800 

1QCICAAOGA GGICAAAQCA QLUJLU1LAA AAGTGAAGGC TAALTTdLTA 7850 

TCCGTAGAGG AAQCTIGCAG CCIGAO330C: CTACATICAG OCAAAIOCAA. 7900 

GnTOQCTAT QGGGCAAAAG AOGICOTTG OZASGCCAGA AAGGOCGTAG 7950 

OOCACATCAA CTG0GTGIG3 AAAGACJLTIL: -IGGAAGACAG TGTAACAGCA 8000 

ATAGACACTA CCAICA3QQC CAAGAA03AG GITTICIQCG TTCAQOCIGA 8050 

GAAGGSSGGT CGTAAGCCAG CILU1LT CA T OJIUI'IOCCC GAOCTQGGOG 8100 

TGCX3 03I GI G CGAGAAGATC QCaCTOEAOS ADjIQGTTEAG CAAGC'IULCL' 8150 

CIGGOCGIGA T33SAAGCTC CIACQGATIC CAATACTCAC CAG3ACAG0S 8200 

Cm ' lU AftTIC CTCGTGCAAG OGflGGAAGTC CAAGAAGAOC CCGA2GGQ3T 8250 

TCICGTAIGA TACCCGCIGT TITGACDCCA. CAGTTCACEGA. GAGOGACATC 8300 

AGGCAA2TIA CCAATGTIGT GACCTGGACC OGCA AGOCOS 8350 

CGIQ3CCAIC AAGTTCCTCA CIGAGAGGCT TEKIUI'IUJG ULLUJILTJA 8400 

CCAATICAAG GG33GAAAAC T0OC33CrACC GCAGG\D3Q33 OX3GAG033C 8450 

GTACTGACAA. CIA0CT3IO3 TAACAODCIC ACT1UCEACA TCftAQ30QCG 8500 

Q3CAOJCTGT O3AQ303CAG G3CIOCAG3A CIQCAOCA3G Uiu^iuivaiG 8550 

QCGA03ACTT AGTCGTEASC TCIGAAAGTG CX33333IOCA GGAGGA OQOG 8600 
GOGAGQCTGA. GAGCXTTICAC GGAG3CTAIG ACGAQSEACT aiHilOOGC 8650 
CQ333AGCCC CCACAAOC^G AATAOGACTT GGAOZETAIA A CA1CA 11XT 8700 
CCICCfACCT GTCAGTOGOC CAO3A0GGO3 CTGGAAAGAG GLJILTACEAC 8750 
CTEACCCGIG ACCCEACAAC CQCCCID3GG AGAGO0Q0GT Q3GAGACAGC 8800 
AAGACACACT CCAGTCAATT CCTG3ZEAG3 CAACAIAATC Aiurj.ii*XC 8850 
CCACAC1GT G GG03AGGATG ATACIGASGA COCmTlL.Tr TAQ-.uiu.-lv. 8900 
ATAQCCAGG3 ATCAGCTIGA ACAQ3CICTT AACIGTGAGA TCTACOGAGC 8950 
CTGCTACICC ATAGAACCAC TGGATCTAOC TDCAAICATT CAAAGACTCC 9000 
ATOQOCTCAG CGCATTTICA. (TICCACAGTT AC1U1CCAGG TGAAAJCAAT 9050

AQJJ1UJC.CG CAIQOCICAG AAAACTIG33 G1LL.UXCCT T33SAGCTIG . 9100 

GAGACACCGG GCGOSGAOGG T00303CTAG QLT1U1U1U1 AGAI33AGSCA 9150 

OGGCIGCTAT A3X3IGGCAAG TAQZICTICA ACIGGGCAGT AAGAACAAAG 9200 

CTTAAACICA. CrOCAATAGC GGCCGCIGGC CGGCTO3ACT VLUllUJjr iU 9250 

GITCACGGCT GGCraCAOOS QGGGAGACAT TTATCACAGC dldl l'lUflG 9300 

OLUJ J UUCCG CIGGTICIG3 TITIGCCTAC TOCIGCIOGC T3CAGG3GTA 9350 

QQCAICIAOC 'lUJiOXCAA CCGAIGAAGG T1U335IRAA. 00X333300 9400 

TCTIfcAGCCA Tl ' lLl ' lUi ' lT TlTlTmTr TITITI'ITIT T lTl'lLTfl T 9450 

Tl ' l'lTl'lUi ' l ' TO ClTlULTr LTITITI ' IUL TITCTTITIC OCTICTTTAA 9500 

TOCJIG3CICC ATCTEAGCCC TAGTCA0G3C T3O0T3TGAA AGG1CCGTGA 9550 

GCCGCAIGAC T2CAGAGAGT GCT3AIACTG GCCICICTGC AGATCA3GT 9599 

FIG. 16F
10 20 30 40 50

1234567890 1234567890 1234567890 1234567890 1234567890 

MSTNPKFQRK TKKNINREFQ D7KFPQ3GQI \K3SVYLLPRR GPP2J3JBKTR 50 

KASERSQPRG PRQPIEKAERR PH3RSWAQFG YPWPLYQNB3 DGWAQWLLSP 100 

RGSRPSWGPT DPKFRSFNLG KVIDTLTOGF AEQGXIPLV GAPU33AARA 150 

IAB3VKVLED GVNYA1GNLP GCSFSIFLLA LL9CLTTPAS AYEVRNVSGI 200 

YHVINDCSNS SIVYEAADtfl MHTP3CVPCV QE3NSSRCW ALTPTLAARN 250 

AS VPl ' l ' l'lK R HVDLLVGTAA PCSAMXVGDL CGSIFLVSQL FIFSPRRHET 300 

VQDCNCSIYP GHV3GHRMAW EMMMNWSPTT ALWSQLLRI PQAVOTMVAG 350 

AHW3VLAGLA YYSMVGMrJAK VKEVALLEAG VDGEIHTIGR VAGHITSCTT 400 

SLFSSGA9QK IQLVNIN3SW HENRTATNCN DSL£fIGFTFAA LETfflHKENSS 450 

GJPEKM ASCR PIDWEAQ3WG PZEYIKPNSS DSRPYCWECiA. PRPGGWPAS 500 

QVCGPVYCFT PSPVWUl'lL) RSGVPTXSWG ENEIEVMLLN NIRPFQGNWF 550 

GCIWMNSIGF 1Kia3GfPPCN IQ3VGNFOLI CPTDCFRKHP EATYTKD3S3 600 

PWLTPPCLVD YPYRIJWHYPC TLHFSIFKVR MYVQ3VEHRL NAAQflWIRGE 650 

RCNLEDRDRS ELSPLLLSTT EWQILFCAFT TLPALSIGLI HLB2NIVDVQ 700 

YLYGVGSAFV SFAIKWEYXL IJLFLLLADAR VQ^£3JWMMLL IAQAEAALEN 750 

LWLNAASVA GAH3ILSFLV FFCAAWYIKG RIAPGAAYAF YG^PLLLLL 800 

LALPPRAYAL DTEVAASCGG WLVGLMALT LSFfYKRYIS VJZM^aIUJYFL 850 

TRVEAQLHVW VPPUSR/RGGR DAVXLLMTW HPTLVFDITK T T ,T AIFGPUaJ 900 

ILQASLUKVP YFVRVQ3LLR ICALARKEAG GHYVQMAIIK D3ALTGIYVY 950 

NHLTPLREWA HNGLRDLAVA VEPWFSRME TKLITW3ADT AAOGDIUSCL 1000 

PVSARRQQEI LiJGPADGMvS KGWRI1APIT AYAOTRGLL GCIITSLTGR 1050 

EKNQVE3EVQ IVSIATQIFL ATCINGVOtfT VYHGAGIKTI ASPKGPVE3YI 1100 

YINVDQDLVG WEAPQGSRSL TFCTCGSSDL PVRRRGDSRG 1150 

SLLSPRPISY LKGSSQGPLL CPAGHAVGLf RAAVCTRGVA KAVDFIPVEN 1200 

D3TIMRSPVF TCNSSPPAVP QSFQVAHLHA PTGSGKSTKV PAAYAAQGYK 1250 

VLVLNPSVAA TLGFGAYMSK AH3VDPNXRT GVRXTTIGSP ITYSrXGKFL 1300 

ADGQCSGGAY DIIICDECHS TDATSID3IG TVLEQAETAG ARLWLATAT 1350 

PPGSVTVSHP NIEEVALSIT GEIPFYGKAI PLEVIKQGRH KEFZHSKKKG 1400 

DELAAKLVAL GINAVAYYRG LD7SVIPT9G EWWSIDAL MIGFTGDFDS 1450 

V1UUN 1L V TQ TVDFSLDPTF TIETITLPQD AVSKIQRRGR 1GRGKP3IYR 1500 

FVAPGERPSG MFDSSVLCEC YDAGCAWYEL TPAETIVRLR AYMNTPGLPV 1550 

CQEHLEFWEG VFTGLTHIDA HELSQTKQ3G H^FPYLVAYQ ATVCARAQAP 1600 

PPSWDgy^KC LIRLKPTLB3 FIPLLYRJjGA VCNEVTLTHP ITKYIMDCMS 1650 

ADLEWTSTW VLVGGVLAAL AAYCLSIGCV VTVGRIVLSG KPAIIPEREV 1700 

LYQEFDEMEE CS(^LPYI32 GMMLAEQFKQ KADGLLOTAS RHAEVTTPAV 1750 

QINWQKLEW WAKHfcMNFIS GIQYLAGLCT LPGNPAIASL MAFTAAVTSP 1800 

LTIGQTLLFN ILGGvaJVAAQL AAPGAATAFV GAGLAGAAIG SVGLGKVLVD 1850 

ILAGYGAGVA GALVAFKH-IS GE7PSTEDLV NLJLPAILSPG ALWGWCAA 1900 

FIG. 16G
H77CV-J4aa Sequence



10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890

XLRPHVGEGE GAVQMINRL2: AFASRGNHVS PTHYVPESDA AARVIAILSS 1950 

LTVIQIiLRRL H3WISSECIT PCSGSWLRDI WDWICEVLSD EKTWLKMOM 2000 

PQLPGIFFVS CQBGHRGVWR CXGIMHTRCH GGAETIGHVK 1&3IMRIVGPR 2050 

1CRNMM9GIF PINAYTIGPC TPLPAPNYKF ALWRVSAEEY VEIRRVGDFH 2100 

WSGMITENL KCPCQIPSEE FFIELEDG7RL HRFAPPCKPL LREEV5ERVG 2150 

IHEYFVG9QL PCEPEEEWAV LTSML1DPSH nAEAAGRRLr ARGSPPateS 2200 

SSASQLSAPS Ij^TCTANHD SPDAELIEMJ 1IJWRQEM3GN TIRVESENKV 2250 

VELDSFDPLV AEEDEREV5V PAEILRKSRR FARALPVWAR PDYNPPLVET 2300 

WKKPDXEPPV VK3ZELPPPR SPPVPPPRKK FTvATLTESTL SIAIAELA IK 2350 

SPGSSSISGI ILUSf l T lSSE PAPSGCPPDS EVESYSSMPP L£EEPC£>PDL 2400 

SDGSWSIVSS GADIEDWCC SMSYSWIGAL VTFCAAEEQK LPINALSN5L 2450 

LRHHNLWST TSRSAQQRQK KVIFCRD2VL DSHYQD^KE VKAAASKVKA 2500 

NLLSVEEACS LTPPHSAKSK FCTGAKDtfRC HARKAVAHIN SVWKDLLEDS 2550 

VTP .UJiT.LM A. KNEVFCVQPE KQ3TCPARLI VFPDD3VRVC EKMALYEXA/S 2600 
KLPLAVI-GSS YGP2YSPGQR VEFLVQAWKS KKTPMZFSYD TRCFDSIWE 2650 
SDIKTEEAIY QCCDLDPQAR VAIKSLTERL YVGGPLTNSR GEN2GYRRCR 2700 
ASGVLTTSQG NTLTCYIKAR AACRAAGLQD CTMLVGGCDL WICESAGVQ 2750 
EDAASLRAFT EAMIRYSAPP GDPPQPEYDL ELTT3CSSNV SVAHDGAGKR 2800 
VYYLTRDPTT PLARAAWETA FHTPVNSWLG NIIMFAPTIW ARMTLMIHFF 2850 
SVLIAPDQLE QALHZEIYGA CYSIEPLDLP PIIQRLHGLS AFSLHSYSPG 2900

EINRVAACLR KUGVPPLRAW RHRARSVRAR LLSP.GGRAAI CGKILFMkV 2950 

RIKLKLTPIA AAGRLDLSGW FIAGYSGGDI YHSVSHARPR WFV3FXZXLILLA 3000 

ACVGIYLLPN R 3011 

FIG. 16H
#1a. 3' Deletion mutants of pCV-H77C

Sequence of 3' untranslated region of pCV-H77C

[Diagram labels: 5' UTR; ORF; 3' UTR. The 3' UTR is divided into the 3' variable region (43 nts), the poly U-UC region (81 nts), and the 3' conserved region (101 nts).]

( 3' variable region; 43 nts )

TGA (Stop codon for polyprotein)

AGGTTGGGGT AAACACTCCG GCCT CTTAAG CCATTTCCTG 

Afl II

(poly U-UC region; 81 nts)

TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC 
CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T 

(3' conserved region; 101 nts) 

AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG 
T 

#1a -1. pCV-H77C(-98X) ; 3' 98 nucleotides removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAAT 

#1a -2. pCV-H77C(-42X) ; 3' 42 nucleotides removed from pCV-H77C 

TGA AGGTTGG GGTAAACACT CCGGCCT CTT AAG CCATTTC CTGTTTTTTT 

TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 

TTTTCCTTTC TTTTTCCCTT CTTTAATGGT GGCTCCATCT TAGCCCTAGT 
CACGGCTAGC TGTGAAAGGT CCGTGAGCCG CAT 

#1a -3. pCV-H77C(X-52) ; All of the 3' UTR sequence, except 3' 49 nucleotides, 
removed from pCV-H77C 

TGAGCCGCAT GACTGCAGAG AGTGCTGATA CTGGCCTCTC T GCAGATCAT 

FIG. 17A
#1a -4. pCV-H77C(X) ; All of the 3' UTR sequence, except 3' 101 nucleotides, 
removed from pCV-H77C 

TGAA ATGGTG GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC 
CGTGAGCCGC ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC 
ATGT 

#1a -5. pCV-H77C(+49X) ; The proximal 49 nucleotides of the 3' conserved 
region ( 98 nucleotides ; AAT not included) removed from pCV-H77C 

TGAAGGTTGG GGTAAACACT CCGGCCTCTT AAGCCATTTC CTGTTTTTTT 
TTTTTTTTTT TTTTTTTTTT TCTTTTTTTT TTTCTTTCCT TTCCTTCTTT 
TTTTCCTTTC TTTTTCCCTT CTTTAATGCC GCATGACTGC AGAGAGTGCT 
GATACTGGCC TCTCTGCAGA TCATGT 

#1a -6. pCV-H77C(VR-24) ; First 24 nucleotides of the 3' variable region 
removed from pCV-H77C 

TGACTTAAGC CATTTCCTGT TTTTTTTTTT TTTTTTTTTT TTTTTTTCTT 
TTTTTTTTTC TTTCCTTTCC TTCTTTTTTT CCTTTCTTTT TCCCTTCTTT 
AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT 
GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG 
T 

#1a -7. pCV-H77C(-U/UC) ; Poly U-UC region removed from pCV-H77C 

TGAAGGTTGG GGTAAACACT CCGGCCTCTT AAGCCATTTC CTGAATGGTG 
GCTCCATCTT AGCCCTAGTC ACGGCTAGCT GTGAAAGGTC CGTGAGCCGC 
ATGACTGCAG AGAGTGCTGA TACTGGCCTC TCTGCAGATC ATGT 

FIG. 17B
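The deletion constructs listed in FIGS. 17A and 17B can be derived mechanically from the three regions of the pCV-H77C 3' UTR. The following is a minimal sketch of that derivation and is not part of the disclosed method. The region sequences are transcribed from FIG. 17A, with the two unreadable characters in the poly U-UC region read as T (an assumption based on the matching stretch in the -98X listing). The helper names `strip` and `deletion_mutants` are hypothetical.

```python
def strip(grouped):
    """Remove the spacing used in the figure's 10-nucleotide groups."""
    return "".join(grouped.split())

# Regions as transcribed from FIG. 17A (assumption: unreadable characters are T).
STOP = "TGA"  # stop codon for the polyprotein
VARIABLE = strip("AGGTTGGGGT AAACACTCCG GCCTCTTAAG CCATTTCCTG")
POLY_U_UC = strip(
    "TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC "
    "CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T"
)
CONSERVED = strip(
    "AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT "
    "GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG T"
)

def deletion_mutants():
    """Return each 3' deletion construct, following the figure captions."""
    full = STOP + VARIABLE + POLY_U_UC + CONSERVED
    return {
        "-98X": full[:-98],                    # 3' 98 nucleotides removed
        "-42X": full[:-42],                    # 3' 42 nucleotides removed
        "X-52": STOP + CONSERVED[-49:],        # only the 3' 49 nucleotides kept
        "X": STOP + CONSERVED,                 # only the 101-nt conserved region kept
        "+49X": STOP + VARIABLE + POLY_U_UC    # proximal 49 nt after AAT removed
                + CONSERVED[:3] + CONSERVED[3 + 49:],
        "VR-24": STOP + VARIABLE[24:] + POLY_U_UC + CONSERVED,  # first 24 nt of variable region removed
        "-U/UC": STOP + VARIABLE + CONSERVED,  # poly U-UC region removed
    }

if __name__ == "__main__":
    for name, seq in deletion_mutants().items():
        print(f"pCV-H77C({name}): {len(seq)} nt")
```

Comparing the computed strings with the printed listings is one way to spot transcription damage in either.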
#1b. Strategy of 3' Deletion mutants 

#1b -1. pCV-H77C(-98X) 

[Diagram labels: 3' variable region; poly U-UC region; 3' conserved region (98 nts); Xba I*]

1. PCR Amplification 

2. Purification of PCR products. 

3. Digestion with Afl II and Xba I 

4. Cloning of Afl II /Xba I fragment into pCV-H77C 

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee ; 11/26/97 and 12/17/97 

8. Result : Negative ( No replication) 



#1b -2. pCV-H77C(-42X) 

[Diagram labels: 3' variable region; poly U-UC region; 3' conserved region (42 nts); Afl II (9403); Synthesized Oligonucleotides; Nhe I (9530); Xba I*]


1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Nhe I and Xba I 

4. Cloning of Nhe I /Xba I fragment into pG9-KL26 (3' UTR of H77C) 

5. Sequence analysis 

6. Cloning of 3' UTR ( -42X ) [Afl II /Xba I fragment] into pCV-H77C 

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee (Schedule; 1/22/98, 2/5/98 ) 
#1b -3. pCV-H77C(X-52) 

[Diagram labels: NS5B; TGA; 3' variable region; poly U-UC region; 3' conserved region (52 nts, 49 nts); Nde I (9160); Synthesized Oligonucleotides; Pfu PCR; Xba I*; Fusion and Extension]


1. Fragment a ; Pfu PCR amplification and purification 

2. Fragment b ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C 

7. Complete sequence analysis 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 
#1b -4. pCV-H77C(X) 

[Diagram labels: NS5B; TGA; 3' variable region; poly U-UC region; 3' conserved region (101 nts); Nde I (9160); Synthesized Oligonucleotides; Xba I*; Fusion and Extension]



1. Fragment a ; Pfu PCR amplification and purification 
2. Fragment c ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Nde I-Xba I fragment with correct sequence into pCV-H77C 

7. Complete sequence analysis 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 



FIG. 17E
#1b -5. pCV-H77C(+49X) 

[Diagram labels: 3' variable region; poly U-UC region; 3' conserved region (49 nts); Afl II (9403); Synthesized Oligonucleotides; Xba I*; Fusion and Extension]



1. Fragment d ; Pfu PCR amplification and purification 

2. Fragment e ; Synthesized oligonucleotides (anti-sense) 

3. Fusion and extension 

4. TA cloning 

5. Sequence analysis 

6. Cloning Afl II-Xba I fragment with correct sequence into pCV-H77C 

7. Complete sequence analysis 

8. In vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 
#1b -6. pCV-H77C(VR-24) 

[Diagram labels: NS5B; TGA; 3' variable region (24 nts); poly U-UC region; 3' conserved region; Nde I (9160); Afl II (9403)]

1. PCR Amplification 

2. Purification of PCR products 

3. Digestion with Nde I and Afl II 

4. Cloning of Nde I /Afl II fragment into pCV-H77C 

5. Complete sequence analysis 

6. in vitro transcription (within 24 hours of inoculation) 

7. Percutaneous intrahepatic transfection into chimpanzee 



#1b -7. pCV-H77C(-U/UC) 

[Diagram labels: 3' variable region; poly U-UC region; 3' conserved region; Afl II (9403); Nhe I (9530); Synthesized Oligonucleotides]

1. Synthesis of oligonucleotides ( sense and anti-sense ) 

2. Hybridization of oligonucleotides 

3. Digestion with Afl II and Nhe I 

4. Cloning of Afl II and Nhe I fragment into pG9-KL26 

5. Sequence analysis 

6. Cloning of 3' UTR ( -poly U-UC ) [Afl II /Xba I fragment] into pCV-H77C 

7. Complete sequence analysis 

8. in vitro transcription (within 24 hours of inoculation) 

9. Percutaneous intrahepatic transfection into chimpanzee 
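
The Afl II (9403) and Nhe I (9530) coordinates used in the cloning strategies can be cross-checked against the 3' UTR sequence of FIG. 17A. The sketch below is illustrative only and rests on stated assumptions: the genome length of 9599 nucleotides is taken from the last line of FIG. 16F, the recognition sequences are the standard ones for Afl II (CTTAAG) and Nhe I (GCTAGC), and a site's coordinate is assumed to be the first nucleotide 3' of the cut, which falls after the first base for both enzymes. The name `site_coordinate` is hypothetical.

```python
GENOME_LENGTH = 9599  # assumption: last coordinate printed in FIG. 16F

def strip(grouped):
    """Remove the spacing used in the figure's 10-nucleotide groups."""
    return "".join(grouped.split())

# Full 3' UTR including the TGA stop codon, transcribed from FIG. 17A
# (assumption: unreadable characters in the poly U-UC region are T).
UTR = strip(
    "TGA AGGTTGGGGT AAACACTCCG GCCTCTTAAG CCATTTCCTG "
    "TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT CTTTCCTTTC "
    "CTTCTTTTTT TCCTTTCTTT TTCCCTTCTT T "
    "AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT "
    "GAGCCGCATG ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCAGATCATG T"
)

def site_coordinate(seq, recognition, cut_offset):
    """Genome coordinate (1-based) of the first base 3' of the cut.

    Assumes seq is the 3'-terminal portion of the genome, so that its
    last nucleotide sits at GENOME_LENGTH.
    """
    index = seq.find(recognition)
    if index < 0:
        raise ValueError(f"site {recognition} not found")
    first_coordinate = GENOME_LENGTH - len(seq) + 1
    return first_coordinate + index + cut_offset

afl_ii = site_coordinate(UTR, "CTTAAG", 1)  # Afl II cuts C^TTAAG
nhe_i = site_coordinate(UTR, "GCTAGC", 1)   # Nhe I cuts G^CTAGC

if __name__ == "__main__":
    print("Afl II:", afl_ii)
    print("Nhe I:", nhe_i)
    print("Afl II to Nhe I spacing:", nhe_i - afl_ii)
```

Under these assumptions the computed coordinates agree with the labels in the strategy diagrams, which supports the reading of the garbled characters in the poly U-UC region as T.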
FIG. 18A
[Figure panels: HCV RNA; Liver Histology; Anti-HCV. Horizontal axis: Week Postinoculation]